RE: [jira] Commented: (LUCENE-1799) Unicode compression

2009-11-19 Thread Steven A Rowe
Hi Robert, On 11/18/2009 at 7:16 PM, Robert Muir wrote: Looking at the collation support, we could maybe improve IndexableBinaryStringTools by using char[]/byte[] with offset and length. The existing ByteBuffer/CharBuffer methods could stay, they are consistent with Charset api and are not

Re: [jira] Commented: (LUCENE-1799) Unicode compression

2009-11-19 Thread Robert Muir
Steven, do you still have a test setup to measure collation key generation performance with Lucene? On Thu, Nov 19, 2009 at 9:38 AM, Steven A Rowe sar...@syr.edu wrote: Hi Robert, On 11/18/2009 at 7:16 PM, Robert Muir wrote: Looking at the collation support, we could maybe improve

[jira] Commented: (LUCENE-1799) Unicode compression

2009-11-19 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780129#action_12780129 ] DM Smith commented on LUCENE-1799: The sample code is probably what is on this page,

RE: [jira] Commented: (LUCENE-1799) Unicode compression

2009-11-19 Thread Steven A Rowe
Hi Robert, Ack, actually two days ago I updated my Lucene trunk checkout and removed that code, thinking its utility had evaporated! But maybe IntelliJ will save my bacon in its local history cache. (Praise IntelliJ!) I'll check tonight when I get home. Steve On 11/19/2009 at 10:16 AM,

Re: [jira] Commented: (LUCENE-1799) Unicode compression

2009-11-19 Thread Robert Muir
doh! well if you have it, that will be very handy for verification. I'll create a separate issue for this shortly, maybe you can review the patch Thanks, Robert On Thu, Nov 19, 2009 at 1:06 PM, Steven A Rowe sar...@syr.edu wrote: Hi Robert, Ack, actually two days ago I updated my Lucene

[jira] Commented: (LUCENE-1799) Unicode compression

2009-11-18 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779442#action_12779442 ] Robert Muir commented on LUCENE-1799: Earwin, if implemented as a directory, we lose

[jira] Commented: (LUCENE-1799) Unicode compression

2009-11-18 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779510#action_12779510 ] Earwin Burrfoot commented on LUCENE-1799: Earwin, if implemented as a directory,

[jira] Commented: (LUCENE-1799) Unicode compression

2009-11-18 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779513#action_12779513 ] Robert Muir commented on LUCENE-1799: bq. Waiting for flexible indexing, hoping it's

[jira] Commented: (LUCENE-1799) Unicode compression

2009-11-18 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779571#action_12779571 ] Michael McCandless commented on LUCENE-1799: The flex API will let you

[jira] Commented: (LUCENE-1799) Unicode compression

2009-11-18 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779577#action_12779577 ] Robert Muir commented on LUCENE-1799: bq. The flex API will let you completely

[jira] Commented: (LUCENE-1799) Unicode compression

2009-11-18 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779576#action_12779576 ] Mark Miller commented on LUCENE-1799: pretty simple though, isn't it? Just pull the

[jira] Commented: (LUCENE-1799) Unicode compression

2009-11-18 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779602#action_12779602 ] Earwin Burrfoot commented on LUCENE-1799: bq. as far as the encoding itself,

[jira] Commented: (LUCENE-1799) Unicode compression

2009-11-18 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779621#action_12779621 ] Robert Muir commented on LUCENE-1799: bq. ICU's API requires to use ByteBuffer and

[jira] Commented: (LUCENE-1799) Unicode compression

2009-11-18 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779629#action_12779629 ] Robert Muir commented on LUCENE-1799: Earwin, i do not really like this

[jira] Commented: (LUCENE-1799) Unicode compression

2009-11-18 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779682#action_12779682 ] Earwin Burrfoot commented on LUCENE-1799: bq. but then i guess we have to deal

Re: [jira] Commented: (LUCENE-1799) Unicode compression

2009-11-18 Thread Robert Muir
btw, does anyone have a guess at how expensive this ByteBuffer/CharBuffer.wrap() is? Looking at the collation support, we could maybe improve IndexableBinaryStringTools by using char[]/byte[] with offset and length. The existing ByteBuffer/CharBuffer methods could stay, they are consistent with
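
For readers following the ByteBuffer/CharBuffer.wrap() question above: the sketch below is only an illustration of the two calling styles being compared, not code from Lucene or from any patch on this issue. The class `WrapSketch` and both helper methods are hypothetical names invented here. `ByteBuffer.wrap(array, offset, length)` does not copy the array; it allocates a small view object over it, whereas the array/offset/length style passes the same slice with no extra object. What that allocation costs in practice is exactly what the thread asks, and this sketch does not measure it.

```java
import java.nio.ByteBuffer;

public class WrapSketch {
    // Hypothetical helper: reads a slice through a ByteBuffer view.
    static int sumViaBuffer(ByteBuffer buf) {
        int sum = 0;
        // Absolute gets leave the buffer's position untouched.
        for (int i = buf.position(); i < buf.limit(); i++) {
            sum += buf.get(i) & 0xFF;
        }
        return sum;
    }

    // Hypothetical helper: reads the same slice straight from the array.
    static int sumViaArray(byte[] bytes, int offset, int length) {
        int sum = 0;
        for (int i = offset; i < offset + length; i++) {
            sum += bytes[i] & 0xFF;
        }
        return sum;
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4, 5, 6};
        // wrap() shares the backing array; position = 2, limit = 5.
        ByteBuffer view = ByteBuffer.wrap(data, 2, 3);
        System.out.println(sumViaBuffer(view));
        System.out.println(sumViaArray(data, 2, 3));
    }
}
```

Both helpers visit the same three bytes, so the two styles are interchangeable for a caller that already holds an array, an offset, and a length.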

[jira] Commented: (LUCENE-1799) Unicode compression

2009-08-11 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741868#action_12741868 ] Earwin Burrfoot commented on LUCENE-1799: I think right now this can be