I uploaded a patch to LUCENE-3424 which implements sequence IDs for
IW. Add, update and delete each return a long seqID for every operation,
and commit returns the largest committed seqID.
When writing transaction logs or a journal (however you wanna call it)
- the biggest problem here is that in a
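
To make the seqID contract described above concrete, here is a minimal sketch in Java. It is not the LUCENE-3424 patch and not Lucene's IndexWriter: SeqIdWriterSketch, TxLogSketch and all of their methods are hypothetical names invented for illustration. It only assumes what the message states, namely that every add/update/delete returns a long seqID and that commit returns the largest committed one, and shows how an application-side log might use that value to drop entries that are already durable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical stand-in for a writer that returns sequence IDs; not Lucene's IndexWriter. */
final class SeqIdWriterSketch {
    private long lastSeqId = 0; // last seq ID handed out

    // Each mutating call returns the seq ID assigned to that operation.
    synchronized long add(String doc) { return ++lastSeqId; }
    synchronized long update(String uid, String doc) { return ++lastSeqId; }
    synchronized long delete(String uid) { return ++lastSeqId; }

    // In this toy model a commit covers every operation issued so far,
    // so it returns the largest seq ID handed out.
    synchronized long commit() { return lastSeqId; }
}

/** Hypothetical application-side log keyed by seq ID (not thread-safe, for brevity). */
final class TxLogSketch {
    private final TreeMap<Long, String> pending = new TreeMap<>();

    void record(long seqId, String op) { pending.put(seqId, op); }

    // Drop every entry whose seq ID is covered by the commit.
    void truncateThrough(long committedSeqId) {
        pending.headMap(committedSeqId, true).clear();
    }

    // Remaining entries in seq ID order, i.e. the order to replay after a crash.
    List<String> replayOrder() { return new ArrayList<>(pending.values()); }
}

final class SeqIdDemo {
    public static void main(String[] args) {
        SeqIdWriterSketch writer = new SeqIdWriterSketch();
        TxLogSketch log = new TxLogSketch();

        log.record(writer.add("doc a"), "add a");
        log.record(writer.delete("a"), "delete a");
        log.truncateThrough(writer.commit()); // both ops are now committed
        log.record(writer.add("doc b"), "add b");

        System.out.println(log.replayOrder()); // only the uncommitted op remains
    }
}
```

The TreeMap keeps entries sorted by seqID, so replay order follows the order the writer assigned rather than the order in which threads happened to reach the log.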
I agree: we should figure out just how an app would effectively make
use of this seq ID, in order to understand if this really is gonna
work end to end. Else we shouldn't change Lucene's core APIs.
EG: could ES remove its lock array if Lucene returned a seq ID? How
bad is it that
On Thu, Sep 8, 2011 at 5:35 PM, Yonik Seeley yo...@lucidimagination.com wrote:
On Thu, Sep 8, 2011 at 11:26 AM, Michael McCandless
luc...@mikemccandless.com wrote:
Returning a long seqID seems the least invasive change to make this
total ordering possible? Especially since the DWDQ already
I created LUCENE-3424 for this. But I still would like to keep the
discussion open here rather than moving this entirely to an issue.
There is more to this than only the seq. IDs.
simon
On Thu, Sep 8, 2011 at 5:35 PM, Yonik Seeley yo...@lucidimagination.com wrote:
On Thu, Sep 8, 2011 at
On 09/09/2011 11:00, Simon Willnauer wrote:
I created LUCENE-3424 for this. But I still would like to keep the
discussion open here rather than moving this entirely to an issue.
There is more to this than only the seq. IDs.
I'm concerned also about the content of the transaction log. In
+1
indeed! All possibilities are needed.
One might do wild things if it is somehow typed. For example,
dictionary compression for fields that are tokenized (not only
stored), as we already have Term dictionary supporting ord-s. Keeping
just a map Token -> ord with the transaction log...
On
On 09/09/2011 12:07, eks dev wrote:
+1
indeed! All possibilities are needed.
One might do wild things if it is somehow typed. For example,
dictionary compression for fields that are tokenized (not only
stored), as we already have Term dictionary supporting ord-s. Keeping
just a map Token-
I didn't think, it was just a spontaneous reaction :)
At the moment I am using static dictionaries to at least get a grip on
size of stored fields (escaping encoded terms)
Re: Global
Maybe the trick would be to somehow use term dictionary as it must be
*eventually* updated? An idea is to write
On 09/09/2011 13:20, eks dev wrote:
I didn't think, it was just a spontaneous reaction :)
At the moment I am using static dictionaries to at least get a grip on
size of stored fields (escaping encoded terms)
Re: Global
Maybe the trick would be to somehow use term dictionary as it must be
On Fri, Sep 9, 2011 at 11:19 AM, Andrzej Bialecki a...@getopt.org wrote:
On 09/09/2011 11:00, Simon Willnauer wrote:
I created LUCENE-3424 for this. But I still would like to keep the
discussion open here rather than moving this entirely to an issue.
There is more to this than only the
hey folks,
we already have transaction logging on Solr side so I should have
started this discussion earlier. However, I want to bring this up to
the list since I think this is a very valuable feature also for plain
Lucene users and eventually this should also be available to them. I
don't think
On 08/09/2011 11:35, Simon Willnauer wrote:
hey folks,
we already have transaction logging on Solr side so I should have
started this discussion earlier. However, I want to bring this up to
the list since I think this is a very valuable feature also for plain
Lucene users and eventually this
On Thu, Sep 8, 2011 at 5:35 AM, Simon Willnauer
simon.willna...@googlemail.com wrote:
I don't think this needs to be a core feature at all but I think we need
to provide the necessary hooks in Lucene core to make this reliable
and consistent.
I've thought about it a little - it would be really
The delete by query is solved by recording the primary / UID of the
document(s) deleted. It's only expensive if the transaction log
implementation is not designed properly. :)
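
One possible reading of the suggestion above, sketched in Java under loud assumptions: the "index" is just an in-memory map from UID to document text, the query is a plain predicate, and DeleteByQueryLogSketch with its fields and methods is a hypothetical name, not a Lucene or Solr API. The point it illustrates is that the log records the resolved UIDs rather than the query itself, so replaying the log does not depend on the state of the index at replay time.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical toy index plus log; resolves a delete-by-query to UIDs before logging. */
final class DeleteByQueryLogSketch {
    final Map<String, String> docsByUid = new LinkedHashMap<>();
    final List<String> log = new ArrayList<>();

    List<String> deleteByQuery(Predicate<String> query) {
        // Step 1: resolve the query to the UIDs it matches right now.
        // This toy version scans every document.
        List<String> uids = new ArrayList<>();
        for (Map.Entry<String, String> e : docsByUid.entrySet()) {
            if (query.test(e.getValue())) {
                uids.add(e.getKey());
            }
        }
        // Step 2: log one delete per UID, then apply it.
        for (String uid : uids) {
            log.add("DELETE " + uid);
            docsByUid.remove(uid);
        }
        return uids;
    }

    public static void main(String[] args) {
        DeleteByQueryLogSketch s = new DeleteByQueryLogSketch();
        s.docsByUid.put("1", "lucene in action");
        s.docsByUid.put("2", "solr cookbook");
        s.docsByUid.put("3", "lucene internals");
        s.deleteByQuery(text -> text.contains("lucene"));
        System.out.println(s.log);
    }
}
```

The cost of this approach sits in step 1: resolving the query to UIDs requires running it against the current index before anything can be logged.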
On Thu, Sep 8, 2011 at 5:35 AM, Simon Willnauer
simon.willna...@googlemail.com wrote:
hey folks,
we already have
On Thu, Sep 8, 2011 at 4:21 PM, Jason Rutherglen
jason.rutherg...@gmail.com wrote:
The delete by query is solved by recording the primary / UID of the
document(s) deleted. It's only expensive if the transaction log
implementation is not designed properly. :)
phew I don't think this is
This isn't a new problem. Databases have been around for what, 30+ years?
On Thu, Sep 8, 2011 at 11:01 AM, Simon Willnauer
simon.willna...@googlemail.com wrote:
On Thu, Sep 8, 2011 at 4:21 PM, Jason Rutherglen
jason.rutherg...@gmail.com wrote:
The delete by query is solved by recording the
On Thu, Sep 8, 2011 at 2:54 PM, Yonik Seeley yo...@lucidimagination.com wrote:
On Thu, Sep 8, 2011 at 5:35 AM, Simon Willnauer
simon.willna...@googlemail.com wrote:
I don't think this needs to be a core feature at all but I think we need
to provide the necessary hooks in Lucene core to make
+1 for having a contrib/transactionlog that apps could use, outside of
Solr/ElasticSearch.
And it sounds like one cannot build such a thing unless one forces an
order above Lucene (like ElasticSearch), or, we make it possible to
see/control the order of ops inside IW?
Even ES's approach is
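
As a rough illustration of what "forcing an order above Lucene" could look like, here is a minimal Java sketch of a striped lock array keyed by document UID. It is not ElasticSearch's actual code: StripedOrderSketch, its methods and the stripe count are assumptions made for this example. Holding the stripe lock across both the log append and the index operation means that, for any single UID, the log order matches the order in which operations reached the index. It does not give a total order across different UIDs, which is the gap a writer-assigned seq ID would close.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical striped-lock wrapper that orders log writes and index ops per UID. */
final class StripedOrderSketch {
    private final ReentrantLock[] locks;
    private final List<String> log = new ArrayList<>(); // guarded by synchronized (log)

    StripedOrderSketch(int stripes) {
        locks = new ReentrantLock[stripes];
        for (int i = 0; i < stripes; i++) {
            locks[i] = new ReentrantLock();
        }
    }

    // Same UID always maps to the same lock; floorMod avoids negative indexes.
    private ReentrantLock lockFor(String uid) {
        return locks[Math.floorMod(uid.hashCode(), locks.length)];
    }

    // Append to the log and run the index operation under the same stripe lock.
    void apply(String uid, String op, Runnable indexOp) {
        ReentrantLock lock = lockFor(uid);
        lock.lock();
        try {
            synchronized (log) {
                log.add(op + " " + uid);
            }
            indexOp.run();
        } finally {
            lock.unlock();
        }
    }

    List<String> logSnapshot() {
        synchronized (log) {
            return new ArrayList<>(log);
        }
    }

    public static void main(String[] args) {
        StripedOrderSketch s = new StripedOrderSketch(16);
        s.apply("doc-1", "ADD", () -> { /* the writer's add call would go here */ });
        s.apply("doc-1", "DELETE", () -> { /* the writer's delete call would go here */ });
        System.out.println(s.logSnapshot());
    }
}
```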
On Thu, Sep 8, 2011 at 11:26 AM, Michael McCandless
luc...@mikemccandless.com wrote:
Returning a long seqID seems the least invasive change to make this
total ordering possible? Especially since the DWDQ already computes
this order...
+1
This seems like the most powerful option.
-Yonik