[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-23 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665171#comment-13665171
 ] 

Shai Erera commented on LUCENE-4975:


Shyam, check out this blog post 
(http://shaierera.blogspot.com/2013/05/the-replicator.html), which explains how 
the Replicator works and includes some example code. The javadocs also contain 
examples. If you run into any issues, don't hesitate to email 
java-u...@lucene.apache.org.

> Add Replication module to Lucene
> 
>
> Key: LUCENE-4975
> URL: https://issues.apache.org/jira/browse/LUCENE-4975
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Shai Erera
>Assignee: Shai Erera
> Fix For: 5.0, 4.4
>
> Attachments: LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, 
> LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, 
> LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, 
> LUCENE-4975.patch, LUCENE-4975.patch
>
>
> I wrote a replication module which I think will be useful to Lucene users who 
> want to replicate their indexes, e.g. for high availability, taking hot backups, 
> etc.
> I will upload a patch soon where I'll describe in general how it works.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-23 Thread Shyam V S (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665156#comment-13665156
 ] 

Shyam V S commented on LUCENE-4975:
---

Shai Erera,
If I want to try out this feature, how and where should I start? I'm planning 
to try it out in a master + 2 slaves Lucene (integrated with Hibernate) setup.




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-13 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655915#comment-13655915
 ] 

Commit Tag Bot commented on LUCENE-4975:


[trunk commit] shaie
http://svn.apache.org/viewvc?view=revision&revision=1481804

LUCENE-4975: Add Replication module to Lucene




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-10 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654829#comment-13654829
 ] 

Michael McCandless commented on LUCENE-4975:


+1, I looked at the replication handlers and they look great!

I wonder if we could factor out touchIndex to a static method and share from 
IndexReplicationHandler and IndexAndTaxoReplicationHandler?
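
A minimal sketch of the refactoring suggested above, assuming nothing about the real handler code: every class name below is a hypothetical stand-in, and the body of touchIndex is a placeholder that only records that it was called. It just shows one static helper shared by a single-index handler and an index-plus-taxonomy handler.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a directory; it only records what was done to it.
final class ToyDirectory {
    final String name;
    final List<String> operations = new ArrayList<>();

    ToyDirectory(String name) {
        this.name = name;
    }
}

// The shared helper: one static method instead of a copy in each handler.
final class ToyHandlerUtil {
    private ToyHandlerUtil() {}

    // Placeholder body (assumption): the real logic is not shown in this thread.
    static void touchIndex(ToyDirectory dir) {
        dir.operations.add("touched");
    }
}

// Stand-in for a handler that manages a single index.
final class ToyIndexHandler {
    private final ToyDirectory indexDir;

    ToyIndexHandler(ToyDirectory indexDir) {
        this.indexDir = indexDir;
    }

    void revisionReady() {
        ToyHandlerUtil.touchIndex(indexDir);
    }
}

// Stand-in for a handler that manages a search index plus a taxonomy index.
final class ToyIndexAndTaxoHandler {
    private final ToyDirectory indexDir;
    private final ToyDirectory taxoDir;

    ToyIndexAndTaxoHandler(ToyDirectory indexDir, ToyDirectory taxoDir) {
        this.indexDir = indexDir;
        this.taxoDir = taxoDir;
    }

    void revisionReady() {
        // Same helper, called once per directory.
        ToyHandlerUtil.touchIndex(taxoDir);
        ToyHandlerUtil.touchIndex(indexDir);
    }
}
```

Both handlers then depend on one piece of code, so a fix to the helper reaches both at once.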




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-10 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653896#comment-13653896
 ] 

Shai Erera commented on LUCENE-4975:


I ran both tests w/ tests.iters=1000 and they passed. This gives me more 
confidence about the robustness of these two handlers. Still, other machines 
can dig up "special" seeds :).




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-08 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651850#comment-13651850
 ] 

Shai Erera commented on LUCENE-4975:


bq. The bug is that IndexWriter.close() waits for merges and commits. Let's quit 
kidding ourselves ok?

Not sure I agree ... the bug is real: if somebody called new 
IW().commit().close() without making any change, he might be surprised by 
that commit too, and IW would also overwrite any existing file. The only thing 
"close-not-committing" would solve in this case is that I could call close() 
instead of rollback(). The bug won't disappear.




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-08 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651847#comment-13651847
 ] 

Robert Muir commented on LUCENE-4975:
-

{quote}
I chatted about this with Mike and he confirmed my reasoning. There is a very slim 
chance of this happening, and it usually indicates a truly bad (or crazy) IO 
subsystem (i.e. not like MDW throwing random IOEs on opening the same file over 
and over again). I think perhaps this can be solved in IW by having it refer to 
the latest commit point read by IFD and not to what it read itself. This seems 
safe to me, but perhaps overkill. Anyway, it belongs in a different issue.

What also happened here is that IW overwrote segments_a (violating write-once 
policy), which MDW didn't catch because for replicator tests I need to turn off 
preventDoubleWrite. Mike also says that IW doesn't guarantee write-once if it 
hits exceptions ...
{quote}

Sorry, I disagree and am against this.

The bug is that IndexWriter.close() waits for merges and commits. Let's quit 
kidding ourselves ok?




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-08 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651839#comment-13651839
 ] 

Shai Erera commented on LUCENE-4975:


bq. I'm trying really hard not to say anything sarcastic here

Heh, I expected something from you :).

I chatted about this with Mike and he confirmed my reasoning. There is a very slim 
chance of this happening, and it usually indicates a truly bad (or crazy) IO 
subsystem (i.e. not like MDW throwing random IOEs on opening the same file over 
and over again). I think perhaps this can be solved in IW by having it refer to 
the latest commit point read by IFD and not to what it read itself. This seems 
safe to me, but perhaps overkill. Anyway, it belongs in a different issue.

What also happened here is that IW overwrote segments_a (violating write-once 
policy), which MDW didn't catch because for replicator tests I need to turn off 
preventDoubleWrite. Mike also says that IW doesn't guarantee write-once if it 
hits exceptions ...

So I think the safest solution is to call deleteUnused() + rollback(), since the 
handler must anyway ensure that this "kiss" does not create commits.

I will resolve the remaining nocommits and post a new patch.




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-08 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651786#comment-13651786
 ] 

Robert Muir commented on LUCENE-4975:
-

I'm trying really hard not to say anything sarcastic here :)




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-08 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651762#comment-13651762
 ] 

Shai Erera commented on LUCENE-4975:


Ok some more insights ... I think this additional chain of events occurs:

* IW's ctor reads gen 9, and SegmentInfos.getSegmentsFileName returns 
segments_9.
* IFD then successfully reads both segments_9 and segments_a, ending up w/ two 
commit points.
* IFD sorts them and passes to IndexDeletionPolicy (KeepLastCommit) which 
deletes segments_9 and keeps segments_a
* IFD sets startingCommitDeleted, since the starting commit (segments_9) was 
indeed deleted

And here's what I still don't understand -- for some reason, IW creates a new 
commit point, segments_a, with the commitData from segments_9. Still need to 
dig into that.

In the meantime, I made the above change to the handler (to call rollback(), not 
close()), and 430 iterations passed. Not sure if that's the right way to go ... 
if IW.close() didn't commit ... ;)
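
To make the file names in the chain of events above concrete, here is a minimal sketch under stated assumptions. Lucene encodes the commit generation in the segments file name in base 36, which is why generation 9 is segments_9 and generation 10 is segments_a. The helper class and its method names are invented for illustration and are not Lucene's own classes, and the "keep only the last commit" policy is a toy model of what the deletion policy does in the second and third steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative helper only; this is not a class from Lucene or from the patch.
final class SegmentsFileNames {
    private static final String PREFIX = "segments_";

    private SegmentsFileNames() {}

    // Commit generation -> file name, using base 36 (digits 0-9, then a-z).
    static String fileNameFor(long generation) {
        return PREFIX + Long.toString(generation, Character.MAX_RADIX);
    }

    // File name -> commit generation (inverse of fileNameFor).
    static long generationOf(String fileName) {
        return Long.parseLong(fileName.substring(PREFIX.length()), Character.MAX_RADIX);
    }

    // Toy "keep only the last commit" policy: sorts the commits by generation
    // and returns every file except the newest, i.e. the ones to delete.
    static List<String> filesToDelete(List<String> commitFiles) {
        List<String> sorted = new ArrayList<>(commitFiles);
        sorted.sort(Comparator.comparingLong(SegmentsFileNames::generationOf));
        if (sorted.isEmpty()) {
            return sorted;
        }
        return new ArrayList<>(sorted.subList(0, sorted.size() - 1));
    }

    public static void main(String[] args) {
        System.out.println(fileNameFor(9));   // segments_9
        System.out.println(fileNameFor(10));  // segments_a
        // Two commit points found on disk; the older one is deleted.
        System.out.println(filesToDelete(Arrays.asList("segments_a", "segments_9")));
    }
}
```

In this model a writer that started from generation 9 loses its starting commit as soon as the policy runs, which matches the startingCommitDeleted step in the list.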




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-07 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650731#comment-13650731
 ] 

Shai Erera commented on LUCENE-4975:


I'm having reservations about creating a replicator/facet module which contains 
2 classes ... Maybe we should proceed with the code as-is, and then refactor if 
it creates a problem, or the module grows? Perhaps the breakout won't be to 
replicator/common and replicator/facet but to replicator/infra (or common) and 
replicator/extras which will serve like a catchall for other modules too (e.g. 
facet, suggest).

Another way is to break out replicator to common and framework/infra/impl such 
that common contains only whatever other modules require to compile against 
(i.e. Revision, ReplicationHandler, maybe Replicator). Then we can add the facet 
replication code to facet/ with a dependency on replicator/common.

But really, I think we should just get it in and start to work with it, have 
deeper reviews and refactor as we go.




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-06 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649731#comment-13649731
 ] 

Shai Erera commented on LUCENE-4975:


I think that's not a bad idea! replicator/common will include the interfaces 
(Revision and ReplicationHandler) + the framework impl and also 
IndexRevision/Handler. replicator/facet will include the taxonomy parts and 
depend on replicator/common and facet.

I can also move the facet related code under oal.replicator.facet and then 
suppress the Lucene3x codec for just these tests.

If others agree, I'll make the changes (mostly build.xml changes).




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-06 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649727#comment-13649727
 ] 

Adrien Grand commented on LUCENE-4975:
--

bq. Then maybe we could have sub-modules for specific replication strategies?

To make my point a little clearer, I was suggesting something pretty much like 
the analysis module: analyzers that require additional dependencies (such as 
icu or morfologik) are in their own sub-module so that you don't need to pull 
the ICU or Morfologik JARs if you just want to use LetterTokenizer (which is in 
lucene/analysis/common).

Likewise, we could have the interface and the logic to replicate simple (no 
sidecar data) indexes in lucene/replicator/common and have sub-modules for 
facet (lucene/replicator/facet) or suggesters (lucene/replicator/suggesters).

This may look like overkill, but at least it would help us keep dependencies 
clean between modules.




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-06 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649674#comment-13649674
 ] 

Shai Erera commented on LUCENE-4975:


Ok, so there are 3 options I see: (1) have Replicator depend on Facet (and in 
the future on other modules), (2) have Facet depend on Replicator and (3) move 
Revision and ReplicationHandler (interfaces) someplace else, core or a new 
module we call 'commons', which both Replicator and Facet depend on. Tests will 
still need to depend on replicator, though, since they need ReplicationClient.

BTW, the jetty dependencies are tests only, but I don't know how to make ivy 
resolve the dependencies just for tests. The only thing replicator depends on 
is servlet-api, for ReplicationService and httpclient for ReplicationClient. I 
think these need to remain in the module ...

If we made Facet depend on Replicator (I'm not totally against it), would that 
require you to have lucene-replicator.jar on the classpath, even if you don't 
use replication? If not, then perhaps this dependency isn't so bad ... it's 
just a compile-time dependency. Tests will still need to depend on replicator 
for runtime, but that's ok I think.
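
Option (3) can be sketched roughly as follows. This is only an illustration of the dependency direction: the interface names Revision and ReplicationHandler come from the discussion, but the methods on them are invented here and are much simpler than the real interfaces in the patch, and all the Toy* and InMemory* classes are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// --- would live in the shared 'commons' module: interfaces only ---
interface Revision {
    String version();            // hypothetical, simplified
    List<String> sourceFiles();  // hypothetical, simplified
}

interface ReplicationHandler {
    String currentVersion();                 // hypothetical, simplified
    void revisionReady(Revision revision);   // hypothetical, simplified
}

// --- would live in facet: compiles against 'commons' only ---
final class ToyIndexAndTaxonomyRevision implements Revision {
    private final String version;

    ToyIndexAndTaxonomyRevision(String version) {
        this.version = version;
    }

    @Override
    public String version() {
        return version;
    }

    @Override
    public List<String> sourceFiles() {
        return Arrays.asList("index", "taxonomy");
    }
}

// --- would live in replicator: sees only the interfaces, never facet ---
final class ToyReplicationClient {
    private final ReplicationHandler handler;

    ToyReplicationClient(ReplicationHandler handler) {
        this.handler = handler;
    }

    // Hands the revision to the handler unless it is already current.
    boolean updateTo(Revision latest) {
        if (latest.version().equals(handler.currentVersion())) {
            return false;
        }
        handler.revisionReady(latest);
        return true;
    }
}

// Trivial handler used to exercise the client.
final class InMemoryHandler implements ReplicationHandler {
    private String current = "";

    @Override
    public String currentVersion() {
        return current;
    }

    @Override
    public void revisionReady(Revision revision) {
        current = revision.version();
    }
}
```

With this layout an application that only uses faceting needs the small interfaces module, not the replicator's HTTP and servlet dependencies.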




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-06 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649658#comment-13649658
 ] 

Robert Muir commented on LUCENE-4975:
-

I still haven't had a chance to look at the patch, but it sounds like some work 
needs to be done here to prevent dll hell.

having replicator depend upon all sidecar modules is a no-go.

it sounds like an interface is missing.




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-06 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649655#comment-13649655
 ] 

Adrien Grand commented on LUCENE-4975:
--

Then maybe we could have sub-modules for specific replication strategies? 
lucene/replicator would only know how to handle raw indexes, while 
lucene/replicator/facets or lucene/replicator/suggest would implement custom 
logic?

This way lucene/facet wouldn't need to pull all lucene/replicator transitive 
dependencies, and lucene/replicator wouldn't depend on any lucene module but 
lucene/core.





[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-06 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649650#comment-13649650
 ] 

Shai Erera commented on LUCENE-4975:


As I said, arguments can be made both ways ... I don't know what the best way is 
here. I can see your point, but I don't feel good about having facet depend on 
replicator. I see Replicator as a higher-level service that, besides providing 
the replication framework, also comes pre-built for replicating Lucene stuff. I 
don't mind seeing it grow to accommodate other Revision types in the future. 
For example, IndexAndTaxonomyRevision is just an example of replicating 
multiple indexes together. It can easily be duplicated to replicate a few indexes 
at once, e.g. a MultiIndexRevision. Where would that object be? It cannot be in 
core, so why should IndexAndTaxo be in facet?




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-06 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649643#comment-13649643
 ] 

Robert Muir commented on LUCENE-4975:
-

{quote}
Even in the future, I would imagine that if we added support for replicating a 
suggester's files, then it would make sense to put a dependency between 
replicator and suggester, rather than the other way around.
{quote}

Wait: how does this make sense?!

It should be the other way around: if suggester has a sidecar it needs special 
logic for replication. 

It does not need faceting.




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-06 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649635#comment-13649635
 ] 

Adrien Grand commented on LUCENE-4975:
--

Good points, you convinced me. :-)

> Add Replication module to Lucene
> 
>
> Key: LUCENE-4975
> URL: https://issues.apache.org/jira/browse/LUCENE-4975
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Shai Erera
>Assignee: Shai Erera
> Attachments: LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, 
> LUCENE-4975.patch, LUCENE-4975.patch
>
>
> I wrote a replication module which I think will be useful to Lucene users who 
> want to replicate their indexes for e.g high-availability, taking hot backups 
> etc.
> I will upload a patch soon where I'll describe in general how it works.




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-06 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649631#comment-13649631
 ] 

Shai Erera commented on LUCENE-4975:


I've been wondering about that too, but chose to keep the facet replication 
code under replicator for a few reasons:

* A Revision contains files from multiple sources, and the taxonomy index is 
partly responsible for that. And ReplicationClient respects that -- so I guess 
it's not entirely true that the Replicator is unaware of taxonomy (even though 
it would still work if I pulled the taxonomy stuff out of it).

* I think it makes less sense to require lucene-replicator.jar for every 
faceted search app which makes use of lucene-facet.jar. The key reason is that 
replicator requires a few additional jars such as httpclient, httpcore, jetty, 
servlet-api. Having the replicator require lucene-facet.jar seems less painful 
to me than requiring every faceted search app out there to include all these 
jars even if it doesn't want to do replication.

* I like to keep things local to the module. There are many similarities 
between IndexAndTaxonomyRevision and IndexRevision (likewise for their handlers 
and tests). Therefore whenever I made a change to one, I knew I should go make 
a similar change to the other.

All in all, I guess arguments can be made both ways, but I prefer for now 
to keep things local to the replicator module. Even in the future, I would 
imagine that if we added support for replicating suggester files, then it 
would make sense to put a dependency between replicator and suggester, rather 
than the other way around.

> Add Replication module to Lucene
> 
>
> Key: LUCENE-4975
> URL: https://issues.apache.org/jira/browse/LUCENE-4975
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Shai Erera
>Assignee: Shai Erera
> Attachments: LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, 
> LUCENE-4975.patch
>
>
> I wrote a replication module which I think will be useful to Lucene users who 
> want to replicate their indexes for e.g high-availability, taking hot backups 
> etc.
> I will upload a patch soon where I'll describe in general how it works.




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-06 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649613#comment-13649613
 ] 

Adrien Grand commented on LUCENE-4975:
--

+1 to commit too.

Looking at the code, there seem to be specialized implementations for faceting 
because of the need to replicate the taxonomy indexes too, so I was wondering 
whether this facet-specific code should be under lucene/facets rather than 
lucene/replicator, so that lucene/replicator doesn't need to depend on all 
modules that have specific replication needs. (I'm not sure what the best 
option is yet, this can be addressed afterwards.)

> Add Replication module to Lucene
> 
>
> Key: LUCENE-4975
> URL: https://issues.apache.org/jira/browse/LUCENE-4975
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Shai Erera
>Assignee: Shai Erera
> Attachments: LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, 
> LUCENE-4975.patch
>
>
> I wrote a replication module which I think will be useful to Lucene users who 
> want to replicate their indexes for e.g high-availability, taking hot backups 
> etc.
> I will upload a patch soon where I'll describe in general how it works.




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-06 Thread Tommaso Teofili (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649564#comment-13649564
 ] 

Tommaso Teofili commented on LUCENE-4975:
-

+1 for committing it

> Add Replication module to Lucene
> 
>
> Key: LUCENE-4975
> URL: https://issues.apache.org/jira/browse/LUCENE-4975
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Shai Erera
>Assignee: Shai Erera
> Attachments: LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, 
> LUCENE-4975.patch
>
>
> I wrote a replication module which I think will be useful to Lucene users who 
> want to replicate their indexes for e.g high-availability, taking hot backups 
> etc.
> I will upload a patch soon where I'll describe in general how it works.




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-05 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649457#comment-13649457
 ] 

Michael McCandless commented on LUCENE-4975:


+1 to commit and iterate from here on... this new module looks very nice!

I like the new testConsistencyOnException ... maybe also call 
MDW.setRandomIOExceptionRateOnOpen?  This will additionally randomly throw 
exceptions from openInput/createOutput.
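
To illustrate the idea of randomly failing on open (this is only a toy sketch, not MockDirectoryWrapper itself; the class FlakyOpenSketch and its method names are hypothetical), a wrapper can consult a seeded Random before each open and throw an IOException at a configurable rate:

```java
import java.io.IOException;
import java.util.Random;

/** Toy fault injector; hypothetical, not part of the Lucene test framework. */
final class FlakyOpenSketch {
  private final Random random;
  private double failureRate; // 0.0 = never fail, 1.0 = always fail

  FlakyOpenSketch(long seed) {
    this.random = new Random(seed); // seeded so that failures are reproducible
  }

  void setFailureRateOnOpen(double rate) {
    this.failureRate = rate;
  }

  /** Pretends to open a file, failing randomly at the configured rate. */
  String openInput(String name) throws IOException {
    // nextDouble() is in [0.0, 1.0), so rate 0.0 never fails and 1.0 always does
    if (random.nextDouble() < failureRate) {
      throw new IOException("simulated failure opening " + name);
    }
    return name;
  }
}
```

A test would then run the replication round-trip under a non-zero rate and check that the index is still consistent after every simulated failure.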

> Add Replication module to Lucene
> 
>
> Key: LUCENE-4975
> URL: https://issues.apache.org/jira/browse/LUCENE-4975
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Shai Erera
>Assignee: Shai Erera
> Attachments: LUCENE-4975.patch, LUCENE-4975.patch, LUCENE-4975.patch, 
> LUCENE-4975.patch
>
>
> I wrote a replication module which I think will be useful to Lucene users who 
> want to replicate their indexes for e.g high-availability, taking hot backups 
> etc.
> I will upload a patch soon where I'll describe in general how it works.




[jira] [Commented] (LUCENE-4975) Add Replication module to Lucene

2013-05-02 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647655#comment-13647655
 ] 

Shai Erera commented on LUCENE-4975:


So here's an overview of how the Replicator works (it's also documented under 
oal.replicator.package.html):

At a high-level, producers (e.g. indexer) publish Revisions, and consumers 
update to the latest Revision available. Like SVN, if a client is on rev1 and 
the server has rev4, the next update request will upgrade the client to rev4, 
skipping all intermediate revisions.
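
The skipping behavior can be modeled with a toy example. This is only a sketch under simplifying assumptions: ToyServer and ToyClient are hypothetical names, versions are plain strings, and no files are actually copied.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of publish/update; these are not the module's real classes. */
final class RevisionSkipSketch {

  /** Server side: remembers published versions in order. */
  static final class ToyServer {
    private final List<String> published = new ArrayList<>();

    void publish(String version) {
      published.add(version);
    }

    /** Returns the latest version if the client is not on it, else null. */
    String checkForUpdate(String clientVersion) {
      if (published.isEmpty()) {
        return null;
      }
      String latest = published.get(published.size() - 1);
      return latest.equals(clientVersion) ? null : latest;
    }
  }

  /** Client side: jumps straight to whatever the server says is latest. */
  static final class ToyClient {
    String current;

    ToyClient(String current) {
      this.current = current;
    }

    /** Returns true if the client moved to a newer revision. */
    boolean update(ToyServer server) {
      String latest = server.checkForUpdate(current);
      if (latest == null) {
        return false; // already up to date
      }
      current = latest; // intermediate revisions are never visited
      return true;
    }
  }
}
```

With rev1 through rev4 published and a client on rev1, a single update() call leaves the client on rev4, and a second call reports that there is nothing to do.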

The Replicator offers two implementations at the moment: LocalReplicator, to be 
used on the server side, and HttpReplicator, to be used by clients to e.g. 
update over HTTP. In the future, we may want to add other Replicator 
implementations, e.g. rsync, torrent... For HTTP, the package also provides a 
ReplicationService which acts on the HTTP servlet request/response following 
some API specification. In that sense, the HttpReplicator expects a certain 
HTTP impl on the server side, so ReplicationService helps you by implementing 
that API. The reason it's not a servlet is so that you can plug it into your 
application servlet freely.

A Revision is basically a list of files and sources. For example, IndexRevision 
contains the list of files in an IndexCommit (and only one source), while 
IndexAndTaxonomyRevision contains the list of files from both IndexCommits with 
corresponding sources (index/taxonomy). When the server publishes either of 
these two revisions, the IndexCommits are snapshotted so that files aren't 
deleted, and the Replicator serves file requests (by clients) from the 
Revision. The Revision is also responsible for releasing itself -- this is done 
automatically by the Replicator which releases a revision when it's no longer 
needed (i.e. there's a new one already) and there are no clients that currently 
replicate its files.
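
One way to picture the release rule is reference counting. The sketch below is an assumption about how such bookkeeping could look, not the module's actual code; ToyRevision and ToyReplicator are hypothetical names. The replicator holds one reference while a revision is current, and each client session holds one more.

```java
/** Toy model of revision release; these are not the module's real classes. */
final class RevisionReleaseSketch {

  static final class ToyRevision {
    final String version;
    private int refCount = 1; // held by the replicator while this is the current revision
    private boolean released;

    ToyRevision(String version) {
      this.version = version;
    }

    synchronized void incRef() {
      if (released) {
        throw new IllegalStateException(version + " was already released");
      }
      refCount++;
    }

    synchronized void decRef() {
      if (--refCount == 0) {
        released = true; // a real revision would release its snapshotted commits here
      }
    }

    synchronized boolean isReleased() {
      return released;
    }
  }

  static final class ToyReplicator {
    private ToyRevision current;

    /** Publishing a new revision drops the replicator's hold on the old one. */
    synchronized void publish(String version) {
      ToyRevision old = current;
      current = new ToyRevision(version);
      if (old != null) {
        old.decRef();
      }
    }

    /** A client that starts replicating pins the current revision. */
    synchronized ToyRevision startSession() {
      if (current == null) {
        throw new IllegalStateException("nothing published yet");
      }
      current.incRef();
      return current;
    }

    /** A client that finished (or gave up) unpins the revision it used. */
    void endSession(ToyRevision revision) {
      revision.decRef();
    }
  }
}
```

In this model, a revision that has been superseded stays alive exactly as long as some client is still copying its files.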

On the client side, the package offers a ReplicationClient class which can be 
invoked either manually, or by starting its update-thread to periodically check 
for updates. The client is given a ReplicationHandler (two matching 
implementations: IndexReplicationHandler and 
IndexAndTaxonomyReplicationHandler) which is responsible for acting on the 
replicated files. The client first obtains all needed files (i.e. those that 
the new Revision offers and the client is still missing), and after they have 
all been successfully copied over, the handler is invoked. Both handlers copy 
the files from their temporary location to the index directories, fsync them 
and "kiss" the index such that unused files are deleted. You can provide each 
handler a Callable which is invoked after the index has been safely and 
successfully updated, so you can e.g. call searcherManager.maybeRefresh().
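
The client-side flow can be sketched with plain sets of file names. This is a simplified illustration, assuming files are identified by name only; ClientUpdateSketch and its methods are hypothetical, and no real copying or fsync happens.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Callable;

/** Toy model of a client update; these are not the module's real classes. */
final class ClientUpdateSketch {

  /** Files that the new revision offers and the client does not have yet. */
  static Set<String> missingFiles(Set<String> revisionFiles, Set<String> localFiles) {
    Set<String> missing = new TreeSet<>(revisionFiles);
    missing.removeAll(localFiles);
    return missing;
  }

  /**
   * Stages the missing files, moves them into the local index, drops files the
   * new revision no longer references, and only then invokes the callback.
   * Returns the set of files that had to be copied.
   */
  static Set<String> update(
      Set<String> revisionFiles, Set<String> localIndex, Callable<Boolean> callback) {
    Set<String> staged = missingFiles(revisionFiles, localIndex); // stands in for the temp dir
    localIndex.addAll(staged);           // handler copies staged files into the index
    localIndex.retainAll(revisionFiles); // unused files are deleted
    if (callback != null) {
      try {
        callback.call(); // e.g. refresh a searcher
      } catch (Exception e) {
        throw new IllegalStateException("callback failed", e);
      }
    }
    return staged;
  }
}
```

For a local index holding _0.cfs and segments_1 and a revision listing _0.cfs, _1.cfs and segments_2, only _1.cfs and segments_2 are copied, and segments_1 is gone afterwards.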

Here's a general code example that explains how to work with the Replicator:

{code}
// ++ SERVER SIDE ++ //
IndexWriter publishWriter; // the writer used for indexing
Replicator replicator = new LocalReplicator();
replicator.publish(new IndexRevision(publishWriter));

// ++ CLIENT SIDE ++ //
// either LocalReplicator, or HttpReplicator if client and server are on
// different nodes
Replicator replicator;

// callback invoked after the handler has finished handling the revision; it
// can e.g. reopen the reader.
Callable<Boolean> callback = null; // may be null if no callback is needed
ReplicationHandler handler = new IndexReplicationHandler(indexDir, callback);
SourceDirectoryFactory factory = new PerSessionDirectoryFactory(workDir);
ReplicationClient client = new ReplicationClient(replicator, handler, factory);

// invoke client manually
client.updateNow();

// or, periodically
client.startUpdateThread(100); // check for update every 100 milliseconds
{code}

The package of course comes with unit tests, though I'm sure there's room for 
improvement (there always is!).

> Add Replication module to Lucene
> 
>
> Key: LUCENE-4975
> URL: https://issues.apache.org/jira/browse/LUCENE-4975
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Shai Erera
>Assignee: Shai Erera
>
> I wrote a replication module which I think will be useful to Lucene users who 
> want to replicate their indexes for e.g high-availability, taking hot backups 
> etc.
> I will upload a patch soon where I'll describe in general how it works.
