[jira] [Commented] (LUCENE-3001) Add TrieFieldHelper lucene so we can write solr compatible Trie* fields w/o solr dependency

2011-03-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012895#comment-13012895
 ] 

Michael McCandless commented on LUCENE-3001:


{quote}
The current problem with NumericField is simply that you don't get one back 
when you do a Document.getField() on the search results. And this is why we did 
the toString() variant in 2.9.
{quote}

Right, this is a long-standing limitation of our NF impl, so let's just fix it 
(store the bit); then we can use binary encoding, user can get NF back, etc.

 Add TrieFieldHelper lucene so we can write solr compatible Trie* fields w/o 
 solr dependency
 ---

 Key: LUCENE-3001
 URL: https://issues.apache.org/jira/browse/LUCENE-3001
 Project: Lucene - Java
  Issue Type: New Feature
Reporter: Ryan McKinley
Priority: Minor
 Attachments: LUCENE-3001-TrieFieldHelper.patch


 The solr support for numeric fields writes the stored value as binary vs the 
 lucene NumericField
 We should move this logic to a helper class in lucene core so that libraries 
 that do not depend on solr can write TrieFields that solr can read.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3001) Add TrieFieldHelper lucene so we can write solr compatible Trie* fields w/o solr dependency

2011-03-29 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012566#comment-13012566
 ] 

Ryan McKinley commented on LUCENE-3001:
---

This is trivial now that lucene+solr share dev.

I will commit soon.




[jira] [Commented] (LUCENE-3001) Add TrieFieldHelper lucene so we can write solr compatible Trie* fields w/o solr dependency

2011-03-29 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012574#comment-13012574
 ] 

Ryan McKinley commented on LUCENE-3001:
---

Added to trunk in #1086651

I'll wait for 3.1 release -- and potentially more discussion -- to port to 3.1




[jira] [Commented] (LUCENE-3001) Add TrieFieldHelper lucene so we can write solr compatible Trie* fields w/o solr dependency

2011-03-29 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012644#comment-13012644
 ] 

Uwe Schindler commented on LUCENE-3001:
---

Hi, sorry I saw the commit but this is not in line with current Lucene:

- Lucene uses Numeric* not Trie* (trie is only used in solr, historically 
because there were other numeric fields before)
- This helper is somehow an ugly replacement for NumericField; the encoding 
should be in NumericField itself, maybe using a binary stored type in its 
ctor.

So please revert this and put it into a Solr util! This commit was much too 
fast and without discussion.




[jira] [Commented] (LUCENE-3001) Add TrieFieldHelper lucene so we can write solr compatible Trie* fields w/o solr dependency

2011-03-29 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012660#comment-13012660
 ] 

Robert Muir commented on LUCENE-3001:
-

bq. The solr support for numeric fields writes the stored value as binary vs 
the lucene NumericField

Why is this done in two different ways? Can we fix it so we only have one 
numerics encoding?




[jira] [Commented] (LUCENE-3001) Add TrieFieldHelper lucene so we can write solr compatible Trie* fields w/o solr dependency

2011-03-29 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012667#comment-13012667
 ] 

Ryan McKinley commented on LUCENE-3001:
---

I backed this out of lucene for now.

It would be great to have a way to write numeric fields that are compatible 
with solr without solr dependencies.  (spatial, etc)




[jira] [Commented] (LUCENE-3001) Add TrieFieldHelper lucene so we can write solr compatible Trie* fields w/o solr dependency

2011-03-29 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012671#comment-13012671
 ] 

Yonik Seeley commented on LUCENE-3001:
--

I'd like to keep Solr's binary encoding.
Perhaps Lucene should switch to that?




[jira] [Commented] (LUCENE-3001) Add TrieFieldHelper lucene so we can write solr compatible Trie* fields w/o solr dependency

2011-03-29 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012680#comment-13012680
 ] 

Uwe Schindler commented on LUCENE-3001:
---

The problem with the binary encoding in Lucene is/was: Lucene lacks a schema, 
so you cannot use the Document.get() methods on the search results to easily 
get the field. This is why the NumericField (which is *not* returned on 
retrieving search results) was simply encoded as String values. The stored 
field contents are not really relevant to search, the toString() was only added 
to NumericField to support an easy to use encoding.

The problem is again compatibility: Lucene must still support the old encoding 
and lots of software relies on it (because you can simply use Document.get() on 
search results). So I propose to extend NumericField to have its own 
Field.Store enum (that provides both stored field encodings). The default 
precisionStep should not be changed (and solr should also switch to 4 as 
default).

Then somebody can use: new NumericField(precisionStep, 
NumericField.Store.STRING / NumericField.Store.BINARY,...). Solr can then also 
return NumericField instances in its getFieldable() method. For decoding search 
results some wrapper must be provided to Lucene (until we have a schema).
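
A minimal sketch of the two stored encodings proposed above, for illustration only. The class name NumericStoreSketch and the nested Store enum are hypothetical stand-ins for the suggested NumericField.Store, not an existing Lucene API. The binary form is assumed to be a fixed 8-byte big-endian long, which may not match Solr's exact layout.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch; this is not the Lucene NumericField API. */
public class NumericStoreSketch {

    /** Hypothetical counterpart of the proposed NumericField.Store enum. */
    public enum Store {
        STRING {
            @Override
            public byte[] encode(long value) {
                // Decimal text, in the spirit of the 2.9 toString() variant.
                return Long.toString(value).getBytes(StandardCharsets.UTF_8);
            }

            @Override
            public long decode(byte[] stored) {
                return Long.parseLong(new String(stored, StandardCharsets.UTF_8));
            }
        },
        BINARY {
            @Override
            public byte[] encode(long value) {
                // Assumed layout: fixed-width, big-endian (ByteBuffer's default order).
                return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
            }

            @Override
            public long decode(byte[] stored) {
                return ByteBuffer.wrap(stored).getLong();
            }
        };

        public abstract byte[] encode(long value);

        public abstract long decode(byte[] stored);
    }

    public static void main(String[] args) {
        long value = 1301356800000L;
        for (Store store : Store.values()) {
            byte[] stored = store.encode(value);
            System.out.println(store + ": " + stored.length
                    + " bytes, decoded=" + store.decode(stored));
        }
    }
}
```

Both variants round-trip the same value. The string form stays readable through a plain Document.get(), while the binary form has a fixed size regardless of magnitude.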

This is just a starting point for discussion.




[jira] [Commented] (LUCENE-3001) Add TrieFieldHelper lucene so we can write solr compatible Trie* fields w/o solr dependency

2011-03-29 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012689#comment-13012689
 ] 

Michael McCandless commented on LUCENE-3001:


With stored fields we are free to store flags w/ each stored field.  Meaning, 
each doc has its own private schema.

EG we can record that this field is a NumericField and store it in binary 
format.

Then, we can also provide a NumericField at search time (instead of a normal 
field whose numeric value was converted to a string at indexing time).

Does that help?  Is the binary format better/preferred (because it's more 
compact)?
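
A rough sketch of the per-field flag idea, under stated assumptions: the one-byte header, the FLAG_NUMERIC_LONG bit, and all method names are hypothetical and do not reflect Lucene's actual stored-fields format. It only illustrates how a reader could hand back a typed value once the type is recorded alongside the stored bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of a stored field that carries its own type flag. */
public class StoredFieldFlagsSketch {

    // Hypothetical flag bit; the real index format assigns its own bits.
    public static final byte FLAG_NUMERIC_LONG = 0x01;

    /** Writes a flags byte of 0 followed by the UTF-8 payload. */
    public static byte[] writeString(String value) {
        byte[] payload = value.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(1 + payload.length)
                .put((byte) 0)
                .put(payload)
                .array();
    }

    /** Writes the numeric flag followed by the long in binary form. */
    public static byte[] writeLong(long value) {
        return ByteBuffer.allocate(1 + Long.BYTES)
                .put(FLAG_NUMERIC_LONG)
                .putLong(value)
                .array();
    }

    /** Returns a Long when the numeric bit is set, otherwise a String. */
    public static Object read(byte[] stored) {
        ByteBuffer in = ByteBuffer.wrap(stored);
        byte flags = in.get();
        if ((flags & FLAG_NUMERIC_LONG) != 0) {
            return in.getLong();
        }
        byte[] payload = new byte[in.remaining()];
        in.get(payload);
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(read(writeLong(42L)).getClass().getSimpleName());
        System.out.println(read(writeString("42")).getClass().getSimpleName());
    }
}
```

Because the flag travels with each stored value, no global schema is needed to decide how to decode it.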




[jira] [Commented] (LUCENE-3001) Add TrieFieldHelper lucene so we can write solr compatible Trie* fields w/o solr dependency

2011-03-29 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012694#comment-13012694
 ] 

Yonik Seeley commented on LUCENE-3001:
--

bq. The default precisionStep should not be changed (and solr should also 
switch to 4 as default).

I don't see why Solr needs to match Lucene everywhere.  I tested myself, and 
the size deltas with smaller precision steps were pretty large.  I think Solr's 
defaults should stay as they are and only be lowered when one desires a different 
tradeoff (faster range queries for larger index size).
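
The size side of this tradeoff can be approximated with simple arithmetic, assuming the usual trie scheme in which one term is indexed per shift of 0, precisionStep, 2*precisionStep, and so on, up to the value's bit width. The helper below is a hypothetical illustration, not Lucene code, and it ignores term dictionary sharing and postings compression.

```java
/** Arithmetic sketch: indexed terms per value for a given precisionStep. */
public class PrecisionStepSketch {

    /** One term per shift below valueBits, i.e. ceil(valueBits / precisionStep). */
    public static int termsPerValue(int valueBits, int precisionStep) {
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        return (valueBits + precisionStep - 1) / precisionStep;
    }

    public static void main(String[] args) {
        for (int step : new int[] {4, 8}) {
            System.out.println("precisionStep=" + step
                    + ": long=" + termsPerValue(64, step)
                    + " terms, int=" + termsPerValue(32, step) + " terms");
        }
    }
}
```

Under this assumption a 64-bit value produces 16 terms at precisionStep 4 and 8 terms at precisionStep 8, which is consistent with smaller steps giving a larger index.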




[jira] [Commented] (LUCENE-3001) Add TrieFieldHelper lucene so we can write solr compatible Trie* fields w/o solr dependency

2011-03-29 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012705#comment-13012705
 ] 

Uwe Schindler commented on LUCENE-3001:
---

bq. I don't see why Solr needs to match Lucene everywhere. I tested myself, and 
the size deltas with smaller precision steps were pretty large. I think Solr's 
defaults should stay as they are and only be lowered when one desires a different 
tradeoff (faster range queries for larger index size).

Those comments are counterproductive for this issue. So the correct way to solve 
this would be: Won't fix. For me as a pure-Lucene user this is of course the 
only correct fix to solve this :-)




[jira] [Commented] (LUCENE-3001) Add TrieFieldHelper lucene so we can write solr compatible Trie* fields w/o solr dependency

2011-03-29 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012723#comment-13012723
 ] 

Uwe Schindler commented on LUCENE-3001:
---

bq. EG we can record that this field is a NumericField and store it in binary 
format.

We have lots of unused bits in the index format for the type (compressed...). A 
numeric field is nothing different from a string or binary field in the 
underlying index format, all are byte[] in trunk. String uses a UTF-8 decoder 
to produce a string from byte[], NumericField uses this 
ByteBuffer.wrap(byte[]).getXxx()-like stuff. FieldsReader can handle this.

I already proposed that in 2.9 but we got strong -1 from some committers. This 
would make the API clean. I would propose that for 4.0 (maybe when we get 
codecs for stored fields).




[jira] [Commented] (LUCENE-3001) Add TrieFieldHelper lucene so we can write solr compatible Trie* fields w/o solr dependency

2011-03-29 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012725#comment-13012725
 ] 

Uwe Schindler commented on LUCENE-3001:
---

Just to add: Solr can handle this, too: If Document.getField() returns an 
instance of NumericField, it uses the value from the NF; if it returns a normal 
Field it can do what it does now (decode manually). Backwards preserved, no 
problem.
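
The fallback described above could look roughly like this. The Field and NumericField classes here are simplified hypothetical stand-ins, not Lucene's real classes, and the manual decoding assumes an 8-byte big-endian long.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-ins for Lucene's field classes; not the real API. */
public class FieldFallbackSketch {

    /** A plain stored field that only exposes raw bytes. */
    public static class Field {
        final byte[] binaryValue;

        public Field(byte[] binaryValue) {
            this.binaryValue = binaryValue;
        }
    }

    /** A field that already carries its decoded numeric value. */
    public static class NumericField extends Field {
        final long numericValue;

        public NumericField(long numericValue) {
            super(null);
            this.numericValue = numericValue;
        }
    }

    /** Prefers the typed value; otherwise decodes the raw bytes manually. */
    public static long readLong(Field field) {
        if (field instanceof NumericField) {
            return ((NumericField) field).numericValue;
        }
        // Old path: assumed 8-byte big-endian encoding of the stored value.
        return ByteBuffer.wrap(field.binaryValue).getLong();
    }

    public static void main(String[] args) {
        byte[] raw = ByteBuffer.allocate(Long.BYTES).putLong(2011L).array();
        System.out.println(readLong(new NumericField(2011L)));
        System.out.println(readLong(new Field(raw)));
    }
}
```

Existing indexes keep working through the manual path, while newer ones can return the typed field directly.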




[jira] [Commented] (LUCENE-3001) Add TrieFieldHelper lucene so we can write solr compatible Trie* fields w/o solr dependency

2011-03-29 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012737#comment-13012737
 ] 

Uwe Schindler commented on LUCENE-3001:
---

bq. The default precisionStep should not have to be standardized – solr 
can always set it in the constructor.

That's my opinion, too. I was just confused about the initial commit where it 
was somehow redefined to 8 in Lucene.

The current problem with NumericField is simply that you don't get one back 
when you do a Document.getField() on the search results. And this is why we did 
the toString() variant in 2.9.

Mike's proposal is in my opinion the best one, make FieldsReader return a 
NumericField if the bit is set. Then we can even store it the ByteBuffer-like 
way (how solr does at the moment). The encoding is then left over to the 
implementation details in Lucene. Just like the encoding of String fields is 
internal to the field type.

When we change the codec API to also handle stored fields, we maybe should also 
make the abstract Field API that communicates with the indexer simply pass a 
BytesRef to the indexer, just like TermToBytesRefAttribute on the indexing 
side. NumericField would simply implement this API differently than StringField 
or BinaryField (or how we would call them).
