[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665171#comment-13665171 ]

Shai Erera commented on LUCENE-4975:

Shyam, check out this blog post (http://shaierera.blogspot.com/2013/05/the-replicator.html), which explains how the Replicator works and includes some example code. The javadocs also contain example docs. If you run into any issues, don't hesitate to email java-u...@lucene.apache.org.

> Add Replication module to Lucene
>
> Key: LUCENE-4975
> URL: https://issues.apache.org/jira/browse/LUCENE-4975
> Project: Lucene - Core
> Issue Type: New Feature
> Reporter: Shai Erera
> Assignee: Shai Erera
> Fix For: 5.0, 4.4
> Attachments: LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch,
> LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch,
> LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch,
> LUCENE-4975.patch, LUCENE-4975.patch
>
> I wrote a replication module which I think will be useful to Lucene users who
> want to replicate their indexes, e.g. for high availability, taking hot backups,
> etc.
> I will upload a patch soon where I'll describe in general how it works.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665156#comment-13665156 ]

Shyam V S commented on LUCENE-4975:

Shai Erera, if I want to try out this feature, how and where should I start? I'm planning to try it out in a master + 2 slaves Lucene (integrated with Hibernate) setup.
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655915#comment-13655915 ]

Commit Tag Bot commented on LUCENE-4975:

[trunk commit] shaie
http://svn.apache.org/viewvc?view=revision&revision=1481804

LUCENE-4975: Add Replication module to Lucene
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654829#comment-13654829 ]

Michael McCandless commented on LUCENE-4975:

+1, I looked at the replication handlers and they look great!

I wonder if we could factor out touchIndex into a static method and share it between IndexReplicationHandler and IndexAndTaxoReplicationHandler?
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653896#comment-13653896 ]

Shai Erera commented on LUCENE-4975:

I ran both tests w/ tests.iters=1000 and they passed. This gives me more confidence about the robustness of these two handlers. Still, other machines can dig up "special" seeds :).
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651850#comment-13651850 ]

Shai Erera commented on LUCENE-4975:

bq. The bug is that IndexWriter.close() waits for merges and commits. Let's quit kidding ourselves ok?

Not sure I agree .. the bug is real, and if somebody did new IW().commit().close() without making any change, he might be surprised about that commit too, and IW would also overwrite any existing file. The only thing "close-not-committing" would solve in this case is that I could call close() instead of rollback(). The bug won't disappear.
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651847#comment-13651847 ]

Robert Muir commented on LUCENE-4975:

{quote}
I chatted about this with Mike and he confirmed my reasoning. This is a very slim chance, and usually indicates a truly bad (or crazy) IO subsystem (i.e. not like MDW throwing random IOEs on opening the same file over and over again). I think perhaps this can be solved in IW by having it refer to the latest commit point read by IFD and not the one it read itself. This seems safe to me, but perhaps overkill. Anyway, it belongs in a different issue.

What also happened here is that IW overwrote segments_a (violating the write-once policy), which MDW didn't catch because for replicator tests I need to turn off preventDoubleWrite. Mike also says that IW doesn't guarantee write-once if it hits exceptions ...
{quote}

Sorry, I disagree and am against this.

The bug is that IndexWriter.close() waits for merges and commits. Let's quit kidding ourselves ok?
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651839#comment-13651839 ]

Shai Erera commented on LUCENE-4975:

bq. I'm trying really hard not to say anything sarcastic here

Heh, I expected something from you :).

I chatted about this with Mike and he confirmed my reasoning. This is a very slim chance, and usually indicates a truly bad (or crazy) IO subsystem (i.e. not like MDW throwing random IOEs on opening the same file over and over again). I think perhaps this can be solved in IW by having it refer to the latest commit point read by IFD and not the one it read itself. This seems safe to me, but perhaps overkill. Anyway, it belongs in a different issue.

What also happened here is that IW overwrote segments_a (violating the write-once policy), which MDW didn't catch because for replicator tests I need to turn off preventDoubleWrite. Mike also says that IW doesn't guarantee write-once if it hits exceptions ...

So I think the safest solution is to deleteUnused() + rollback(), as the handler must anyway ensure that commits are not created by this "kiss".

I will resolve the remaining nocommits and post a new patch.
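The difference between closing and rolling back the writer that the handler opens only to "kiss" the index can be sketched with a toy model. This is not Lucene's IndexWriter: ToyWriter and its methods are hypothetical, and the rule that a deleted starting commit marks the writer as changed is an assumption made here to mirror the behaviour described in this thread.

```java
/** Toy model only; this is NOT Lucene's IndexWriter. */
final class ToyWriter {
    private long generation;   // generation of the commit this writer read on open
    private boolean changed;   // true when close() would have something to commit

    ToyWriter(long generationRead, boolean startingCommitDeleted) {
        this.generation = generationRead;
        // Assumption: losing the starting commit marks the writer as changed.
        this.changed = startingCommitDeleted;
    }

    /** Models a committing close(): returns the segments file written, or null if none. */
    String close() {
        if (!changed) {
            return null;
        }
        generation++;
        changed = false;
        // Generations are rendered in base 36, so generation 10 becomes "a".
        return "segments_" + Long.toString(generation, Character.MAX_RADIX);
    }

    /** Models rollback(): pending state is dropped and nothing is written. */
    String rollback() {
        changed = false;
        return null;
    }

    public static void main(String[] args) {
        // The writer read generation 9 and its starting commit was deleted.
        System.out.println(new ToyWriter(9, true).close());    // a new commit file is written
        System.out.println(new ToyWriter(9, true).rollback()); // nothing is written
    }
}
```

In this model a committing close() on a writer that read generation 9 writes segments_a even though no documents changed, which is the overwrite discussed above, while rollback() leaves the existing files alone.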
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651786#comment-13651786 ]

Robert Muir commented on LUCENE-4975:

I'm trying really hard not to say anything sarcastic here :)
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651762#comment-13651762 ]

Shai Erera commented on LUCENE-4975:

Ok some more insights ... I think this additional chain of events occurs:

* IW's ctor reads gen 9, and SegmentInfos.getSegmentsFileName returns segments_9.
* IFD then successfully reads both segments_9 and segments_a, ending up w/ two commit points.
* IFD sorts them and passes them to IndexDeletionPolicy (KeepLastCommit), which deletes segments_9 and keeps segments_a.
* IFD marks startingCommitDeleted, since the starting commit (segments_9) was indeed deleted.

And here's what I still don't understand -- for some reason, IW creates a new commit point, segments_a, with the commitData from segments_9. Still need to dig into that.

In the meantime, I made the above change to the handler (to rollback(), not close()), and 430 iterations passed. Not sure if that's the right way to go ... if IW.close() didn't commit ... ;)
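For readers wondering why segments_9 is followed by segments_a: as far as I know, Lucene renders the commit generation in base 36, so generation 10 is written as "a". The sketch below illustrates that naming together with a keep-only-the-last-commit rule. SegmentsNaming and its methods are hypothetical stand-ins, not Lucene classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Toy model of commit-point naming; these are not Lucene's actual classes. */
final class SegmentsNaming {
    /** Generation N maps to "segments_" followed by N written in base 36. */
    static String segmentsFileName(long generation) {
        return "segments_" + Long.toString(generation, Character.MAX_RADIX);
    }

    /** Parses the generation back out of a segments_N file name. */
    static long generationOf(String fileName) {
        return Long.parseLong(fileName.substring("segments_".length()), Character.MAX_RADIX);
    }

    /** Hypothetical keep-only-last-commit rule: returns every commit except the newest. */
    static List<String> commitsToDelete(List<String> commits) {
        List<String> sorted = new ArrayList<>(commits);
        sorted.sort(Comparator.comparingLong(SegmentsNaming::generationOf));
        if (sorted.isEmpty()) {
            return sorted;
        }
        return new ArrayList<>(sorted.subList(0, sorted.size() - 1));
    }

    public static void main(String[] args) {
        System.out.println(segmentsFileName(9));
        System.out.println(segmentsFileName(10));
        System.out.println(commitsToDelete(List.of("segments_a", "segments_9")));
    }
}
```

With both segments_9 and segments_a present, the rule deletes segments_9, which is exactly the commit the writer started from in the chain of events above.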
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650731#comment-13650731 ]

Shai Erera commented on LUCENE-4975:

I'm having reservations about creating a replicator/facet module which contains 2 classes ... Maybe we should proceed with the code as-is, and then refactor if it creates a problem, or the module grows? Perhaps the breakout won't be to replicator/common and replicator/facet but to replicator/infra (or common) and replicator/extras, which will serve as a catchall for other modules too (e.g. facet, suggest).

Another way is to break out replicator into common and framework/infra/impl, such that common contains only whatever other modules need to compile against (i.e. Revision, ReplicationHandler, maybe Replicator). Then we can add the facet replication code to facet/ with a dependency on replicator/common.

But really, I think we should just get it in and start to work with it, have deeper reviews and refactor as we go.
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649731#comment-13649731 ]

Shai Erera commented on LUCENE-4975:

I think that's not a bad idea! replicator/common will include the interfaces (Revision and ReplicationHandler) + the framework impl and also IndexRevision/Handler. replicator/facet will include the taxonomy parts and depend on replicator/common and facet. I can also move the facet related code under oal.replicator.facet and then suppress the Lucene3x codec for just these tests.

If others agree, I'll make the changes (mostly build.xml changes).
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649727#comment-13649727 ]

Adrien Grand commented on LUCENE-4975:

bq. Then maybe we could have sub-modules for specific replication strategies?

To make my point a little clearer, I was suggesting something pretty much like the analysis module: analyzers that require additional dependencies (such as icu or morfologik) are in their own sub-module so that you don't need to pull the ICU or Morfologik JARs if you just want to use LetterTokenizer (which is in lucene/analysis/common). Likewise, we could have the interface and the logic to replicate simple (no sidecar data) indexes in lucene/replicator/common and have sub-modules for facet (lucene/replicator/facet) or suggesters (lucene/replicator/suggesters).

This may look like overkill but at least it would help us keep dependencies clean between modules.
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649674#comment-13649674 ]

Shai Erera commented on LUCENE-4975:

Ok, so there are 3 options I see:

(1) have Replicator depend on Facet (and in the future on other modules),
(2) have Facet depend on Replicator, and
(3) move Revision and ReplicationHandler (interfaces) someplace else, core or a new module we call 'commons', and have Replicator and Facet depend on it. Tests will still need to depend on replicator though, since they need ReplicationClient.

BTW, the jetty dependencies are tests only, but I don't know how to make ivy resolve the dependencies just for tests. The only things replicator depends on are servlet-api, for ReplicationService, and httpclient, for ReplicationClient. I think these need to remain in the module ...

If we made Facet depend on Replicator (I'm not totally against it), would that require you to have lucene-replicator.jar on the classpath, even if you don't use replication? If not, then perhaps this dependency isn't so bad ... it's just a compile-time dependency. Tests will still need to depend on replicator at runtime, but that's ok I think.
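Option (3) can be pictured with a small sketch: the two interfaces sit in a shared module, the facet-side revision compiles only against them, and the framework side never references facet classes. The interface names Revision and ReplicationHandler come from this thread, but the method signatures below are simplified assumptions for illustration, and ToyIndexAndTaxonomyRevision and RecordingHandler are hypothetical.

```java
import java.util.List;
import java.util.Map;

// Would live in the shared module (core or 'commons'). Signatures are simplified assumptions.
interface Revision {
    String getVersion();
    Map<String, List<String>> getSourceFiles();
}

interface ReplicationHandler {
    String currentVersion();
    void revisionReady(Revision revision);
}

// Would live on the facet side; it needs only the interfaces above.
final class ToyIndexAndTaxonomyRevision implements Revision {
    private final String version;
    private final Map<String, List<String>> files;

    ToyIndexAndTaxonomyRevision(String version, List<String> indexFiles, List<String> taxonomyFiles) {
        this.version = version;
        this.files = Map.of("index", List.copyOf(indexFiles), "taxonomy", List.copyOf(taxonomyFiles));
    }

    @Override public String getVersion() { return version; }
    @Override public Map<String, List<String>> getSourceFiles() { return files; }
}

// Would live on the replicator side; it never mentions facet classes.
final class RecordingHandler implements ReplicationHandler {
    private String version;

    @Override public String currentVersion() { return version; }
    @Override public void revisionReady(Revision revision) { version = revision.getVersion(); }

    public static void main(String[] args) {
        ReplicationHandler handler = new RecordingHandler();
        handler.revisionReady(new ToyIndexAndTaxonomyRevision(
                "a:3", List.of("segments_a", "_0.cfs"), List.of("segments_3")));
        System.out.println(handler.currentVersion());
    }
}
```

Under this layout neither Replicator nor Facet depends on the other at compile time; only code that wires the two together, such as the tests mentioned above, needs both on the classpath.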
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649658#comment-13649658 ]

Robert Muir commented on LUCENE-4975:

I still haven't had a chance to look at the patch, but it sounds like some work needs to be done here to prevent dll hell.

Having replicator depend upon all sidecar modules is a no-go. It sounds like an interface is missing.
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649655#comment-13649655 ]

Adrien Grand commented on LUCENE-4975:

Then maybe we could have sub-modules for specific replication strategies? lucene/replicator would only know how to handle raw indexes, while lucene/replicator/facets or lucene/replicator/suggest would implement custom logic? This way lucene/facet wouldn't need to pull all lucene/replicator transitive dependencies, and lucene/replicator wouldn't depend on any lucene module but lucene/core.
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649650#comment-13649650 ]

Shai Erera commented on LUCENE-4975:

As I said, arguments can be made both ways ... I don't know what the best way is here. I can see your point, but I don't feel good about having facet depend on replicator. I see Replicator as a higher-level service that, besides providing the replication framework, also comes pre-built for replicating Lucene stuff. I don't mind seeing it grow to accommodate other Revision types in the future.

For example, IndexAndTaxonomyRevision is just an example of replicating multiple indexes together. It can easily be duplicated to replicate a few indexes at once, e.g. a MultiIndexRevision. Where would that object be? It cannot be in core, so why should IndexAndTaxo be in facet?
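The MultiIndexRevision idea mentioned above could look roughly like the following: a single revision that groups the files of several named indexes and derives one combined version from their generations. No such class is described in this thread beyond its name, so ToyMultiIndexRevision, its methods, and the colon-separated base-36 version format are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical sketch of a revision spanning several indexes; not a Lucene class. */
final class ToyMultiIndexRevision {
    private final Map<String, List<String>> sourceFiles = new LinkedHashMap<>();
    private final Map<String, Long> generations = new LinkedHashMap<>();

    /** Registers one index (a "source") with its commit generation and file names. */
    ToyMultiIndexRevision addSource(String source, long generation, List<String> files) {
        if (sourceFiles.containsKey(source)) {
            throw new IllegalArgumentException("duplicate source: " + source);
        }
        sourceFiles.put(source, List.copyOf(files));
        generations.put(source, generation);
        return this;
    }

    /** Assumed format: the generations in insertion order, base 36, joined by ':'. */
    String getVersion() {
        StringJoiner joiner = new StringJoiner(":");
        for (long generation : generations.values()) {
            joiner.add(Long.toString(generation, Character.MAX_RADIX));
        }
        return joiner.toString();
    }

    /** Files to copy, grouped by the source they belong to. */
    Map<String, List<String>> getSourceFiles() {
        return Collections.unmodifiableMap(sourceFiles);
    }

    public static void main(String[] args) {
        ToyMultiIndexRevision revision = new ToyMultiIndexRevision()
                .addSource("index", 10, List.of("segments_a", "_0.cfs"))
                .addSource("taxonomy", 3, List.of("segments_3", "_0.cfs"));
        System.out.println(revision.getVersion());
        System.out.println(revision.getSourceFiles().keySet());
    }
}
```

Keying the files by source is what lets two indexes that both contain a file such as _0.cfs be replicated together without a name clash.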
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649643#comment-13649643 ]

Robert Muir commented on LUCENE-4975:

{quote}
Even in the future, I would imagine that if we added support for replicating a suggester's files, then it would make sense to put a dependency between replicator and suggester, rather than the other way around.
{quote}

Wait: how does this make sense?! It should be the other way around: if suggester has a sidecar it needs special logic for replication. It does not need faceting.
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649635#comment-13649635 ]

Adrien Grand commented on LUCENE-4975:

Good points, you convinced me. :-)
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649631#comment-13649631 ] Shai Erera commented on LUCENE-4975: I've been wondering about that too, but chose to keep the facet replication code under replicator for few reasons: * A Revision contains files from multiple sources, and the taxonomy index is partly responsible for that. And ReplicationClient respects that -- so I guess it's not entirely true that the Replicator is unaware of taxonomy (even though it would still work if I pulled the taxonomy stuff out of it). * I think it makes less sense to require lucene-replicator.jar for every faceted search app which makes use of lucene-facet.jar. The key reason is that replicator requires few additional jars such as httpclient, httpcore, jetty, servlet-api. Requiring lucene-facet.jar seems less painful to me, than requiring every faceted search app out there to include all these jars even if it doesn't want to do replication. * I like to keep things local to the module. There are many similarities between IndexAndTaxoRevision to IndexRevision (likewise for their handlers and tests). Therefore whenever I made change to one, I knew I should go make a similar change to the other. All in all, I guess arguments can be made both ways, but I prefer for the now to keep things local to the replicator module. Even in the future, I would imagine that if we added support for replicating a suggester files, then it would make sense to put a dependency between replicator and suggester, rather than the other way around. 
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649613#comment-13649613 ] Adrien Grand commented on LUCENE-4975:

+1 to commit too.

Looking at the code, there seem to be specialized implementations for faceting because of the need to replicate the taxonomy indexes too, so I was wondering whether this facet-specific code should live under lucene/facet rather than lucene/replicator, so that lucene/replicator doesn't need to depend on all modules that have specific replication needs. (I'm not sure what the best option is yet; this can be addressed afterwards.)
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649564#comment-13649564 ] Tommaso Teofili commented on LUCENE-4975:

+1 for committing it
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649457#comment-13649457 ] Michael McCandless commented on LUCENE-4975:

+1 to commit and iterate from here on... this new module looks very nice!

I like the new testConsistencyOnException ... maybe also call MDW.setRandomIOExceptionRateOnOpen? This will additionally randomly throw exceptions from openInput/createOutput.
[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647655#comment-13647655 ] Shai Erera commented on LUCENE-4975:

So here's an overview of how the Replicator works (it's also documented under oal.replicator.package.html):

At a high level, producers (e.g. the indexer) publish Revisions, and consumers update to the latest Revision available. Like SVN, if a client is on rev1 and the server has rev4, the next update request will upgrade the client to rev4, skipping all intermediate revisions.

The Replicator offers two implementations at the moment: LocalReplicator, to be used on the server side, and HttpReplicator, to be used by clients to e.g. update over HTTP. In the future, we may want to add other Replicator implementations, e.g. rsync, torrent...

For HTTP, the package also provides a ReplicationService which acts on the HTTP servlet request/response following some API specification. In that sense, the HttpReplicator expects a certain HTTP impl on the server side, so ReplicationService helps you by implementing that API. The reason it's not a servlet is so that you can plug it into your application servlet freely.

A Revision is basically a list of files and sources. For example, IndexRevision contains the list of files in an IndexCommit (and only one source), while IndexAndTaxonomyRevision contains the list of files from both IndexCommits with corresponding sources (index/taxonomy). When the server publishes either of these two revisions, the IndexCommits are snapshotted so that files aren't deleted, and the Replicator serves file requests (by clients) from the Revision. The Revision is also responsible for releasing itself -- this is done automatically by the Replicator, which releases a revision when it's no longer needed (i.e. there's a new one already) and there are no clients that currently replicate its files.
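To make the revision lifecycle above concrete, here is a minimal sketch of the publish / update / release rules. It is a toy model, not the Lucene API: the names ToyReplicator, Rev, activeSessions and released are hypothetical and chosen only for illustration, and the "files" are plain strings. It only tries to show that a client jumps straight to the latest revision, and that a superseded revision is released once no client is still copying its files.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, NOT the classes of the replicator module.
final class ToyReplicator {

    /** A published revision: a version number plus the files it pins. */
    static final class Rev {
        final int version;
        final List<String> files;
        int activeSessions = 0;   // clients currently copying this revision's files
        boolean released = false; // true once the pinned files may be deleted

        Rev(int version, List<String> files) {
            this.version = version;
            this.files = new ArrayList<>(files);
        }
    }

    private Rev current; // latest published revision, or null if none yet

    /** Publishes a newer revision; the old one is released if nobody is copying it. */
    synchronized void publish(Rev rev) {
        if (current != null) {
            if (rev.version <= current.version) {
                throw new IllegalArgumentException("revision must be newer than " + current.version);
            }
            if (current.activeSessions == 0) {
                current.released = true;
            }
        }
        current = rev;
    }

    /**
     * Returns the latest revision if it is newer than the client's version
     * (intermediate revisions are skipped), otherwise null. A non-null result
     * opens a session that the client must close with release().
     */
    synchronized Rev checkForUpdate(int clientVersion) {
        if (current == null || current.version <= clientVersion) {
            return null;
        }
        current.activeSessions++;
        return current;
    }

    /** Closes a session; a superseded revision is released when its last session ends. */
    synchronized void release(Rev rev) {
        rev.activeSessions--;
        if (rev != current && rev.activeSessions == 0) {
            rev.released = true;
        }
    }
}
```

In this sketch a client on version 1 that calls checkForUpdate(1) while the server holds version 4 receives version 4 directly, which mirrors the SVN analogy above.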
On the client side, the package offers a ReplicationClient class which can either be invoked manually or start its update thread to periodically check for updates. The client is given a ReplicationHandler (two matching implementations: IndexReplicationHandler and IndexAndTaxonomyReplicationHandler) which is responsible for acting on the replicated files. The client first obtains all needed files (i.e. those that the new Revision offers and the client is still missing), and after they have all been successfully copied over, the handler is invoked. Both handlers copy the files from their temporary location to the index directories, fsync them and kiss the index such that unused files are deleted. You can provide each handler a Callable which is invoked after the index has been safely and successfully updated, so you can e.g. call searcherManager.maybeRefresh().

Here's a general code example that explains how to work with the Replicator:

{code}
// ++ SERVER SIDE ++ //
IndexWriter publishWriter; // the writer used for indexing
Replicator replicator = new LocalReplicator();
replicator.publish(new IndexRevision(publishWriter));

// ++ CLIENT SIDE ++ //
// either LocalReplicator, or HttpReplicator if client and server are on different nodes
Replicator replicator;
// callback invoked after handler finished handling the revision and e.g. can reopen the reader.
Callable<Boolean> callback = null; // can also be null if no callback is needed
ReplicationHandler handler = new IndexReplicationHandler(indexDir, callback);
SourceDirectoryFactory factory = new PerSessionDirectoryFactory(workDir);
ReplicationClient client = new ReplicationClient(replicator, handler, factory);

// invoke client manually
client.updateNow();

// or, periodically
client.startUpdateThread(100); // check for update every 100 milliseconds
{code}

The package of course comes with unit tests, though I'm sure there's room for improvement (there always is!).
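The order of operations on the client side (fetch everything first, only then touch the index, then run the callback) can be sketched as follows. This is a toy model under stated assumptions and not the module's implementation: ToyClient and its methods are hypothetical names, directories are in-memory maps from file name to content, and a null content stands in for a failed transfer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;

// Toy model only: hypothetical names and in-memory "directories", NOT the Lucene API.
final class ToyClient {
    final Map<String, String> indexDir = new HashMap<>(); // file name -> content
    private final Callable<Boolean> callback;             // may be null

    ToyClient(Callable<Boolean> callback) {
        this.callback = callback;
    }

    /** Files that the revision offers and the client does not have yet. */
    List<String> missingFiles(Map<String, String> revision) {
        List<String> missing = new ArrayList<>();
        for (String name : revision.keySet()) {
            if (!indexDir.containsKey(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    /** Applies the revision; returns true if the index was updated. */
    boolean update(Map<String, String> revision) {
        List<String> missing = missingFiles(revision);
        if (missing.isEmpty()) {
            return false; // nothing new to fetch
        }
        // 1. Copy every missing file to a temporary location first.
        //    A failure here leaves indexDir untouched.
        Map<String, String> temp = new HashMap<>();
        for (String name : missing) {
            String content = revision.get(name);
            if (content == null) { // stands in for a failed transfer
                throw new IllegalStateException("failed to copy " + name);
            }
            temp.put(name, content);
        }
        // 2. Handler step: move the fetched files into the index directory.
        indexDir.putAll(temp);
        // 3. Drop files that the new revision no longer references.
        indexDir.keySet().retainAll(revision.keySet());
        // 4. Only after a successful update, notify the application.
        if (callback != null) {
            try {
                callback.call();
            } catch (Exception e) {
                throw new IllegalStateException("callback failed", e);
            }
        }
        return true;
    }
}
```

The point of copying to a temporary location first is that a transfer failing halfway never leaves the index directory with a partial set of files; the callback is where an application would refresh its searcher.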