[jira] [Updated] (SOLR-2700) transaction logging
[ https://issues.apache.org/jira/browse/SOLR-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated SOLR-2700:
--
    Fix Version/s: (was: 4.1)
                   4.0
    Assignee: Yonik Seeley

transaction logging
---
    Key: SOLR-2700
    URL: https://issues.apache.org/jira/browse/SOLR-2700
    Project: Solr
    Issue Type: New Feature
    Components: SolrCloud
    Reporter: Yonik Seeley
    Assignee: Yonik Seeley
    Fix For: 4.0
    Attachments: SOLR-2700.patch, SOLR-2700.patch, SOLR-2700.patch, SOLR-2700.patch, SOLR-2700.patch, SOLR-2700.patch, SOLR-2700.patch, SOLR-2700.patch

A transaction log is needed for durability of updates, for a more performant realtime-get, and for replaying updates to recovering peers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2700) transaction logging
[ https://issues.apache.org/jira/browse/SOLR-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated SOLR-2700:
--
    Component/s: SolrCloud
[jira] [Updated] (SOLR-2700) transaction logging
[ https://issues.apache.org/jira/browse/SOLR-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul updated SOLR-2700:
-
    Attachment: SOLR-2700.patch

Serialize the strings to a meta file (.tlm = transaction log meta) before the add is complete.
[jira] [Updated] (SOLR-2700) transaction logging
[ https://issues.apache.org/jira/browse/SOLR-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-2700:
---
    Attachment: SOLR-2700.patch

OK, I think we're getting close to committing now.

Among other things, this latest version adds the abstract UpdateLog class with NullUpdateLog and FSUpdateLog subclasses, adds an updateHandler/updateLog section to solrconfig.xml, and allows one to specify the log directory. Currently the default is NullUpdateLog.
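The class layout described above (an abstract log type with a no-op default and a file-backed variant) can be pictured with a minimal sketch. This is not the code from the patch: the nested class names, the `add` and `isEnabled` methods, the `create` factory and the `tlog.sketch` file name are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustration only: hypothetical names, not the UpdateLog API from the SOLR-2700 patch. */
public class UpdateLogSketch {

    /** Hypothetical abstract base; both method names are assumptions. */
    public abstract static class Log {
        public abstract void add(String id, String doc);
        public abstract boolean isEnabled();
    }

    /** No-op variant, so callers never need a null check when logging is off. */
    public static final class NullLog extends Log {
        @Override public void add(String id, String doc) { /* intentionally does nothing */ }
        @Override public boolean isEnabled() { return false; }
    }

    /** File-backed variant that appends one tab-separated line per update. */
    public static final class FsLog extends Log {
        private final Path file;

        public FsLog(Path logDir) throws IOException {
            Files.createDirectories(logDir);
            this.file = logDir.resolve("tlog.sketch"); // hypothetical file name
        }

        @Override public void add(String id, String doc) {
            byte[] line = (id + "\t" + doc + "\n").getBytes(StandardCharsets.UTF_8);
            try {
                Files.write(file, line, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        @Override public boolean isEnabled() { return true; }

        public Path file() { return file; }
    }

    /** Falls back to the no-op log when no directory is configured. */
    public static Log create(Path logDirOrNull) throws IOException {
        return logDirOrNull == null ? new NullLog() : new FsLog(logDirOrNull);
    }

    public static void main(String[] args) throws IOException {
        Log log = create(args.length > 0 ? Path.of(args[0]) : null);
        log.add("doc1", "title=hello");
        System.out.println("logging enabled: " + log.isEnabled());
    }
}
```

With a no-op default, the update path can call the log unconditionally, and logging is switched on only by configuring a directory.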
[jira] [Updated] (SOLR-2700) transaction logging
[ https://issues.apache.org/jira/browse/SOLR-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-2700:
---
    Attachment: SOLR-2700.patch

Here's an update that uses a fixed set of external strings in the javabin codec to avoid repeating all of the field names in the logs. This drops the indexing penalty to 28% slower in this specific test, and decreases the transaction log size to 974M.
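To see why a fixed table of strings shrinks the log, here is a minimal sketch of the general technique. It does not use the javabin format: the record layout, the class name `ExternalStringsSketch` and the `encode` method are assumptions for illustration. Field names found in the table are written as a two-byte index, and all other names are written inline.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustration only: a made-up record layout, not the javabin wire format. */
public class ExternalStringsSketch {
    private final List<String> table; // fixed set of known field names

    public ExternalStringsSketch(List<String> fixedFieldNames) {
        this.table = fixedFieldNames;
    }

    /** Encodes a document; with useTable=false every field name is written inline. */
    public byte[] encode(Map<String, String> doc, boolean useTable) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeShort(doc.size());
            for (Map.Entry<String, String> field : doc.entrySet()) {
                int index = useTable ? table.indexOf(field.getKey()) : -1;
                out.writeShort(index);               // -1 marks "name follows inline"
                if (index < 0) {
                    out.writeUTF(field.getKey());
                }
                out.writeUTF(field.getValue());
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        ExternalStringsSketch codec =
                new ExternalStringsSketch(Arrays.asList("id", "title", "description"));
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "42");
        doc.put("title", "transaction logging");
        doc.put("description", "durability for updates");
        System.out.println("inline names: " + codec.encode(doc, false).length + " bytes");
        System.out.println("table lookup: " + codec.encode(doc, true).length + " bytes");
    }
}
```

The saving grows with the number of fields per document, since each name is replaced by a constant-size index once it is in the shared table.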
[jira] [Updated] (SOLR-2700) transaction logging
[ https://issues.apache.org/jira/browse/SOLR-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-2700:
---
    Attachment: SOLR-2700.patch
[jira] [Updated] (SOLR-2700) transaction logging
[ https://issues.apache.org/jira/browse/SOLR-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-2700:
---
    Attachment: SOLR-2700.patch

Patch that updates to trunk and comments out the prints (those were actually causing test failures for some reason...)

{code}
[junit] Testsuite: org.apache.solr.update.Batch-With-Multiple-Tests
[junit] Testcase: org.apache.solr.update.Batch-With-Multiple-Tests:testDistribSearch: Caused an ERROR
[junit] Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.
[junit] junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.
[junit] at java.lang.Thread.run(Thread.java:680)
{code}
[jira] [Updated] (SOLR-2700) transaction logging
[ https://issues.apache.org/jira/browse/SOLR-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-2700:
---
    Attachment: SOLR-2700.patch

Here's the latest prototype patch.

I've hit a bit of an oddity with locking, though, that causes TestRealTimeGet to hang. I put a ReentrantLock around the commit in the update handler. The test hangs with one or more of the writer threads blocked on the .lock().
 - .unlock() is called in a finally block, so it should always get called
 - I added a counter that is incremented after the lock and decremented after the unlock. It shows 0 in the debugger after the hang, meaning that we unlocked as many times as we locked.
 - the *only* place that touches that lock is DUH2.commit()
 - if I look into the Sync object inside the ReentrantLock, the state is 1 (meaning locked, I think). The exclusiveOwnerThread is main for some reason.
 - I think what I am seeing is that unlock() seems to normally fail to take effect. The normal course is that cleanIndex() causes the main thread to do a deleteByQuery + commit, and even though the print says the lock was released, main retains it and no one else can ever acquire it. I can see the output via IntelliJ, but not from the command line (since output seems to be buffered until the end of the test).
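For readers following the locking description, here is a minimal sketch of the pattern as described: a ReentrantLock around the commit, the unlock in a finally block, and a balance counter. It makes no claim about the cause of the hang. The class name `CommitLockSketch` and its methods are hypothetical. `isLocked()` and `getHoldCount()` are standard `ReentrantLock` methods that expose the same state from code rather than from a debugger.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Illustration only: hypothetical class, not DirectUpdateHandler2. */
public class CommitLockSketch {
    private final ReentrantLock commitLock = new ReentrantLock();
    // Incremented after lock(), decremented after unlock(); 0 means the calls are balanced.
    private final AtomicInteger balance = new AtomicInteger();

    /** Runs the commit work while holding the lock; the finally block always releases it. */
    public void commit(Runnable commitWork) {
        commitLock.lock();
        balance.incrementAndGet();
        try {
            commitWork.run();
        } finally {
            commitLock.unlock();
            balance.decrementAndGet();
        }
    }

    /** True if any thread currently holds the lock. */
    public boolean isLocked() {
        return commitLock.isLocked();
    }

    /** Number of holds by the calling thread; greater than 1 indicates reentrant acquisition. */
    public int holdCountForCurrentThread() {
        return commitLock.getHoldCount();
    }

    public int balance() {
        return balance.get();
    }

    public static void main(String[] args) throws InterruptedException {
        CommitLockSketch handler = new CommitLockSketch();
        Thread[] writers = new Thread[4];
        for (int i = 0; i < writers.length; i++) {
            writers[i] = new Thread(() -> handler.commit(() -> { }));
            writers[i].start();
        }
        for (Thread writer : writers) {
            writer.join();
        }
        System.out.println("locked=" + handler.isLocked() + " balance=" + handler.balance());
    }
}
```

Logging `isLocked()` and the per-thread hold count at the end of each commit is one way to cross-check the counter when a debugger is not attached.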
[jira] [Updated] (SOLR-2700) transaction logging
[ https://issues.apache.org/jira/browse/SOLR-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-2700:
---
    Attachment: SOLR-2700.patch

Here's an update that handles delete-by-id and also makes lookups concurrent (no synchronization on the file reads so multiple can proceed at once).
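One way to get unsynchronized concurrent reads from a log file in Java is the positional `FileChannel.read(ByteBuffer, long)` call, which does not move the channel's shared position. The sketch below shows that technique under assumptions: the length-prefixed record layout and the names `ConcurrentLookupSketch`, `append` and `readAt` are invented here, and the patch may do this differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustration only: hypothetical record layout (4-byte length, then UTF-8 bytes). */
public class ConcurrentLookupSketch {

    /** Appends one record and returns its start offset. Writers are serialized. */
    public static synchronized long append(FileChannel ch, String value) throws IOException {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + bytes.length);
        buf.putInt(bytes.length).put(bytes).flip();
        long start = ch.size();
        long at = start;
        while (buf.hasRemaining()) {
            at += ch.write(buf, at);
        }
        return start;
    }

    /** Positional read: takes no lock and leaves the channel position untouched. */
    public static String readAt(FileChannel ch, long pos) throws IOException {
        ByteBuffer length = ByteBuffer.allocate(4);
        readFully(ch, length, pos);
        ByteBuffer body = ByteBuffer.allocate(length.getInt(0));
        readFully(ch, body, pos + 4);
        return new String(body.array(), StandardCharsets.UTF_8);
    }

    private static void readFully(FileChannel ch, ByteBuffer dst, long pos) throws IOException {
        while (dst.hasRemaining()) {
            int n = ch.read(dst, pos + dst.position());
            if (n < 0) {
                throw new IOException("unexpected end of log");
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("tlog", ".sketch");
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            long first = append(ch, "doc1");
            long second = append(ch, "doc2");
            Thread[] readers = new Thread[4];
            for (int i = 0; i < readers.length; i++) {
                long pos = (i % 2 == 0) ? first : second;
                readers[i] = new Thread(() -> {
                    try {
                        System.out.println(readAt(ch, pos));
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
                readers[i].start();
            }
            for (Thread reader : readers) {
                reader.join();
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Only appends are serialized in this sketch. Readers need no coordination with each other, provided an offset is handed out only after its record has been fully written.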
Re: [jira] [Updated] (SOLR-2700) transaction logging
Just a casual comment.. This issue marks another big milestone in Solr/Lucene evolution: it moves in a new direction, toward being not only a search library but a full data storage/manipulation solution. Who needs SQL and NoSQL DBs? They cannot search without painful integration :)

IMO, this issue is symbolically just as important for us users as flex indexing was. Flex indexing and column stride fields are great infrastructure to build upon, but they also started with one small step, the omitTf hack :)

Mike is great with his "progress, not perfection".

cheers,
eks

On Sun, Aug 7, 2011 at 7:44 PM, Yonik Seeley (JIRA) j...@apache.org wrote:
> Yonik Seeley updated SOLR-2700:
>     Attachment: SOLR-2700.patch
>
> Here's an update that handles delete-by-id and also makes lookups concurrent (no synchronization on the file reads so multiple can proceed at once).
Re: [jira] [Updated] (SOLR-2700) transaction logging
On Sun, Aug 7, 2011 at 2:49 PM, eks dev eks...@yahoo.co.uk wrote:
> Just a casual comment.. This issue marks another big milestone in Solr/Lucene evolution: it moves in a new direction, toward being not only a search library but a full data storage/manipulation solution. Who needs SQL and NoSQL DBs? They cannot search without painful integration :)
>
> IMO, this issue is symbolically just as important for us users as flex indexing was.

Yep! solr = nosql + search

-Yonik
http://www.lucidimagination.com
[jira] [Updated] (SOLR-2700) transaction logging
[ https://issues.apache.org/jira/browse/SOLR-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-2700:
---
    Attachment: SOLR-2700.patch

Here's a draft patch. A tlog.<number> file is created for each commit. The javabin format is used to serialize SolrInputDocuments. An in-memory map of pointers into the log is kept for documents not yet soft-committed, and the realtime-get component checks that first before using SolrCore.getNewestSearcher().

Seems to work for getting documents not in the newest searcher so far.

Tons of stuff left to do:
 - the tlog files are currently in the CWD
 - need to handle deletes
 - need to handle flushes in a performant way
 - need to implement optional fsync for durability on power-failure
 - would be nice to make some of this multi-threaded for better performance
 - need to implement durability (apply updates from logs on startup)
 - need to implement some form of cleanup for transaction logs
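The lookup path in the draft (an in-memory map of pointers into the log, consulted before the searcher) can be sketched as follows. This is a simplified illustration, not the patch: records are written with `writeUTF` instead of javabin, and the class `TlogPointerSketch` and its `add`, `lookup` and `softCommit` methods are hypothetical names.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustration only: hypothetical class; records use writeUTF, not javabin. */
public class TlogPointerSketch implements AutoCloseable {
    private final RandomAccessFile log;
    // id -> offset of the latest logged record, kept only for documents not yet soft-committed
    private final Map<String, Long> pointers = new ConcurrentHashMap<>();

    public TlogPointerSketch(Path file) throws IOException {
        this.log = new RandomAccessFile(file.toFile(), "rw");
    }

    /** Appends a record to the end of the log and remembers where it starts. */
    public synchronized void add(String id, String body) throws IOException {
        long start = log.length();
        log.seek(start);
        log.writeUTF(id);
        log.writeUTF(body);
        pointers.put(id, start);
    }

    /** Returns the logged body, or null when the caller should fall back to the index. */
    public synchronized String lookup(String id) throws IOException {
        Long start = pointers.get(id);
        if (start == null) {
            return null;
        }
        log.seek(start);
        log.readUTF(); // skip the id
        return log.readUTF();
    }

    /** After a soft commit the documents are visible to a searcher, so the pointers can go. */
    public void softCommit() {
        pointers.clear();
    }

    @Override
    public void close() throws IOException {
        log.close();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("tlog", ".sketch");
        try (TlogPointerSketch tlog = new TlogPointerSketch(file)) {
            tlog.add("doc1", "title=first");
            tlog.add("doc1", "title=second");
            System.out.println("before soft commit: " + tlog.lookup("doc1"));
            tlog.softCommit();
            System.out.println("after soft commit: " + tlog.lookup("doc1"));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

A `null` from `lookup` stands in for the fall-through to the newest searcher described in the message. Clearing the map at soft commit keeps it bounded by the number of uncommitted documents.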