Re: Incremental Field Updates

2010-05-12 Thread Babak Farhang
> Of course, it raises an interesting point, what are the implications for numeric fields?
Not sure whether you're referring to the general or the specific, but with the approach Shai is proposing, if the numeric fields are indexed using the new trie structures, then it would be important to pr

[jira] Commented: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866524#action_12866524 ] Michael McCandless commented on LUCENE-2455: bq. But, why wouldn't they be abl

[jira] Commented: (LUCENE-1585) Allow to control how payloads are merged

2010-05-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866525#action_12866525 ] Michael McCandless commented on LUCENE-1585: bq. Still, I think that I'm most

[jira] Commented: (LUCENE-2410) Optimize PhraseQuery

2010-05-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866526#action_12866526 ] Michael McCandless commented on LUCENE-2410: Looks great Robert -- I think you

[jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866528#action_12866528 ] Michael McCandless commented on LUCENE-2458: This is sneaky behavior on QueryP

Re: Incremental Field Updates

2010-05-12 Thread Michael McCandless
I think this would work perfectly fine w/ Shai's approach... To Lucene a NumericField is just a series of terms w/ no positions indexed. So when a value is changed, we'd get a new series of terms, do the delta, and then subtract & add accordingly in the stacked segments. Mike On Wed, May 12, 20
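
The "do the delta, then subtract & add" step above can be sketched roughly as follows. This is a simplified model, not Lucene's actual term encoding: the names trie_terms and term_delta, the (shift, prefix) representation, and the restriction to non-negative values are all assumptions made for illustration.

```python
def trie_terms(value, precision_step=4, bits=32):
    """Model a numeric value as a set of (shift, prefix) terms.

    Hypothetical simplification: real NumericField terms are encoded
    differently and also cover negative values.
    """
    if not 0 <= value < (1 << bits):
        raise ValueError("value out of range for this sketch")
    # One term per precision level: the value with its low bits dropped.
    return {(shift, value >> shift) for shift in range(0, bits, precision_step)}


def term_delta(old_value, new_value, precision_step=4, bits=32):
    """Return (terms_to_subtract, terms_to_add) for a changed value."""
    old_terms = trie_terms(old_value, precision_step, bits)
    new_terms = trie_terms(new_value, precision_step, bits)
    # Terms shared by both values need no change in the stacked segment.
    return old_terms - new_terms, new_terms - old_terms


removed, added = term_delta(16, 17, precision_step=4, bits=8)
print(sorted(removed))  # [(0, 16)]
print(sorted(added))    # [(0, 17)]
```

In this model a small change touches only the lowest-precision terms, since the higher-level prefixes are shared between the old and new value.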

RE: [jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Itamar Syn-Hershko
The QueryParser also fails to correctly parse Hebrew acronyms; although not being an integral part of the current discussion, I thought this would be the best place to bring that up. Hebrew acronyms are assembled of letters with a single double-quote char within, example: MNK"L (Hebrew for CEO). T

[jira] Commented: (LUCENE-1585) Allow to control how payloads are merged

2010-05-12 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866537#action_12866537 ] Shai Erera commented on LUCENE-1585: bq. What was the issue w/ that approach again? a

[jira] Commented: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-12 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866539#action_12866539 ] Shai Erera commented on LUCENE-2455: bq. Adding indexes using FilterIndexReader is use

[jira] Created: (SOLR-1908) Deduplication removes all docs

2010-05-12 Thread Markus (JIRA)
Deduplication removes all docs
Key: SOLR-1908
URL: https://issues.apache.org/jira/browse/SOLR-1908
Project: Solr
Issue Type: Improvement
Affects Versions: 1.4
Reporter: Markus
Priori

[jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866557#action_12866557 ] Robert Muir commented on LUCENE-2458: - {quote} What are some real use-cases where this

[jira] Updated: (SOLR-1908) Deduplication removes all docs

2010-05-12 Thread Markus (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus updated SOLR-1908: - Description: Dedupe removes all documents from the index if overwriteDupes=true and the schema's signature field

Re: [jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Robert Muir
On Wed, May 12, 2010 at 6:05 AM, Itamar Syn-Hershko wrote:
> The QueryParser also fails to correctly parse Hebrew acronyms; although not being an integral part of the current discussion, I thought this would be the best place to bring that up.
Just as I don't think Analysis should do QueryP

[jira] Updated: (SOLR-1908) Deduplication removes all docs

2010-05-12 Thread Markus (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus updated SOLR-1908: - Component/s: update > Deduplication removes all docs > -- > > Key: SOL

[jira] Commented: (LUCENE-2410) Optimize PhraseQuery

2010-05-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866572#action_12866572 ] Robert Muir commented on LUCENE-2410: - Committed revisions 943493 (trunk), 943499 (3x)

[jira] Created: (SOLR-1909) Return information on found duplicates during update

2010-05-12 Thread Markus (JIRA)
Return information on found duplicates during update
Key: SOLR-1909
URL: https://issues.apache.org/jira/browse/SOLR-1909
Project: Solr
Issue Type: Improvement
Components: update

[jira] Updated: (SOLR-1909) Return information on found duplicates during update

2010-05-12 Thread Markus (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus updated SOLR-1909: - Description: Deduplication does not return any information in its response object about found (and optionally ove

Re: [jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Mark Miller
On 5/12/10 9:25 AM, Robert Muir wrote:
> (and, contrary to what you would believe from the documentation, the choice of whether or not to make a PhraseQuery is not based on syntax one bit!)
That's a major exaggeration - quoting text plays a large role in whether or not you will get a phrase quer

Re: [jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Robert Muir
On Wed, May 12, 2010 at 11:16 AM, Mark Miller wrote:
> That's a major exaggeration - quoting text plays a large role in whether or not you will get a phrase query.
No, it has nothing to do with it in the implementation. It only "escapes the whitespace", but is discarded. This is clear from l

[jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866595#action_12866595 ] Marvin Humphrey commented on LUCENE-2458: - I have mixed feelings about this for En

Re: [jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Mark Miller
On 5/12/10 11:24 AM, Robert Muir wrote:
> On Wed, May 12, 2010 at 11:16 AM, Mark Miller wrote:
>> That's a major exaggeration - quoting text plays a large role in whether or not you will get a phrase query.
> No, it has nothing to do with it in the implementation. It only "escapes the whitespace",

[jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866603#action_12866603 ] Marvin Humphrey commented on LUCENE-2458: - > Because they show its 10x better to u

[jira] Commented: (LUCENE-2393) Utility to output total term frequency and df from a lucene index

2010-05-12 Thread Tom Burton-West (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866606#action_12866606 ] Tom Burton-West commented on LUCENE-2393: - I tweaked the latest patch to mimic the

Problems passing PyLucene objects to jcc-wrapped bobo-browse api

2010-05-12 Thread Julian Maibaum
Hi, Christian Heimes, Dirk Rothe, and I have jcc-wrapped bobo-browse (http://code.google.com/p/bobo-browse/) in order to add faceted search capabilities to PyLucene. However, the two modules don't play well together, as wrappers from PyLucene cannot be used in a bobo-browse context and vice ve

[jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866648#action_12866648 ] Robert Muir commented on LUCENE-2458: - {quote} As described in another recent thread,

[jira] Issue Comment Edited: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866648#action_12866648 ] Robert Muir edited comment on LUCENE-2458 at 5/12/10 1:40 PM: --

[jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Ivan Provalov (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1285#action_1285 ] Ivan Provalov commented on LUCENE-2458: --- Robert has asked me to post our test result

[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr

2010-05-12 Thread Peter Sturge (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866674#action_12866674 ] Peter Sturge commented on SOLR-1163: Hi Uri, Really like what you've done here. +1 +vot

[jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Marvin Humphrey (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866693#action_12866693 ] Marvin Humphrey commented on LUCENE-2458: - > I'm honestly having a tough time seei

[jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866696#action_12866696 ] Hoss Man commented on LUCENE-2458: -- bq. Instead the queryparser should only form phrasequ

[jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866695#action_12866695 ] Robert Muir commented on LUCENE-2458: - {quote} Change the initial split on whitespace

[jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866698#action_12866698 ] Robert Muir commented on LUCENE-2458: - bq. but all other things being equal lets keep

[jira] Updated: (SOLR-896) Solr Query Parser Plugin for Mark Miller's Qsol Parser

2010-05-12 Thread Chris Harris (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Harris updated SOLR-896: -- Attachment: SOLR-896.patch Slightly revised patch: * Now we support the df (CommonParams.DF, aka "defaul

Re: Indexing a Reader instead of a String to a field value

2010-05-12 Thread Chris Hostetter
: I have a DIH setup in which I obtain a java.io.Reader for a field's value. It's a reader because I'm getting it from a source that may store a lot of text. I traced the value of a field, stored for quite some time as an Object, through Solr until it got to Solr's DocumentBuilder line ~27

[jira] Created: (SOLR-1910) Add hl.df (highlight-specific default field) param, so highlighting can have a separate analysis path

2010-05-12 Thread Chris Harris (JIRA)
Add hl.df (highlight-specific default field) param, so highlighting can have a separate analysis path
Key: SOLR-1910
URL: https://issues.apache.org/jira/browse/S

[jira] Updated: (SOLR-1910) Add hl.df (highlight-specific default field) param, so highlighting can have a separate analysis path from search

2010-05-12 Thread Chris Harris (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Harris updated SOLR-1910: --- Summary: Add hl.df (highlight-specific default field) param, so highlighting can have a separate analy

[jira] Updated: (SOLR-1910) Add hl.df (highlight-specific default field) param, so highlighting can have a separate analysis path from search

2010-05-12 Thread Chris Harris (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Harris updated SOLR-1910: --- Attachment: SOLR-1910.patch > Add hl.df (highlight-specific default field) param, so highlighting can

[jira] Updated: (LUCENE-2393) Utility to output total term frequency and df from a lucene index

2010-05-12 Thread Tom Burton-West (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Burton-West updated LUCENE-2393: Attachment: LUCENE-2393.patch Rewrote argument processing so the default behavior is that

[jira] Updated: (LUCENE-2393) Utility to output total term frequency and df from a lucene index

2010-05-12 Thread Tom Burton-West (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Burton-West updated LUCENE-2393: Attachment: (was: LUCENE-2393) > Utility to output total term frequency and df from a

RE: [jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Itamar Syn-Hershko
Never did I request the QP to do Analysis. I simply mentioned this bug - what this definitely is - so you could tackle it while you're at it. This is definitely relevant to a discussion about re-making how the QP determines what is a legit PhraseQuery and what is not. The fix is quite easy I be

Re: [jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Robert Muir
On Wed, May 12, 2010 at 6:30 PM, Itamar Syn-Hershko wrote:
> Never did I request the QP to do Analysis. I simply mentioned this bug - what this definitely is -
It's definitely not a bug for Hebrew: there is a Unicode character for gershayim (U+05F4), so technically this should be used according
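
As a rough illustration of the U+05F4 point, a client could swap an ASCII double-quote for gershayim before the text ever reaches the parser. This is only a sketch of that idea, not something proposed in the thread: the function name to_gershayim is hypothetical, and it assumes a quote counts as gershayim only when it sits directly between two Hebrew letters (U+05D0 to U+05EA).

```python
import re

# Hebrew letters alef..tav; used inside a regex character class below.
HEBREW_LETTERS = "\u05D0-\u05EA"
GERSHAYIM = "\u05F4"

# An ASCII double-quote with a Hebrew letter on both sides.
_EMBEDDED_QUOTE = re.compile(f'(?<=[{HEBREW_LETTERS}])"(?=[{HEBREW_LETTERS}])')


def to_gershayim(text):
    """Replace an ASCII quote between two Hebrew letters with U+05F4."""
    return _EMBEDDED_QUOTE.sub(GERSHAYIM, text)


acronym = '\u05DE\u05E0\u05DB"\u05DC'    # the Hebrew acronym for CEO, typed with an ASCII quote
print(to_gershayim(acronym) == '\u05DE\u05E0\u05DB\u05F4\u05DC')  # True
print(to_gershayim('"foo bar"'))         # unchanged: "foo bar"
```

Quotes around Latin text or at word boundaries are left alone, so ordinary phrase syntax is not affected by this step.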

[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr

2010-05-12 Thread Uri Boness (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866769#action_12866769 ] Uri Boness commented on SOLR-1163: -- Hi Peter, The explorer communicates with Solr via http

RE: [jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Itamar Syn-Hershko
I think we understand each other perfectly well. I still think resolving this is very simple, by just applying a correct logic (ignore double-quotes followed by a char) which isn't enforced today and once it will be, it won't cause any cases of unexpected behavior. This isn't an analysis related ta
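
One possible reading of the rule described above ("ignore double-quotes followed by a char") is sketched below. It is a toy splitter, not the Lucene QueryParser: the names is_embedded_quote and split_query are hypothetical, and the sketch assumes a quote is "embedded" only when it has a letter on both sides, since an opening phrase quote is also followed by a character.

```python
def is_embedded_quote(text, i):
    """True if text[i] is a double-quote with a letter on each side."""
    return (
        text[i] == '"'
        and 0 < i < len(text) - 1
        and text[i - 1].isalpha()
        and text[i + 1].isalpha()
    )


def split_query(text):
    """Split a query string into ("term", s) and ("phrase", s) tokens."""
    tokens, buf, in_phrase = [], [], False
    for i, ch in enumerate(text):
        if ch == '"' and not is_embedded_quote(text, i):
            # A real delimiter: close the current token and toggle phrase mode.
            if buf:
                tokens.append(("phrase" if in_phrase else "term", "".join(buf)))
                buf = []
            in_phrase = not in_phrase
        elif ch.isspace() and not in_phrase:
            # Whitespace outside a phrase ends the current term.
            if buf:
                tokens.append(("term", "".join(buf)))
                buf = []
        else:
            buf.append(ch)
    if buf:
        tokens.append(("phrase" if in_phrase else "term", "".join(buf)))
    return tokens


print(split_query('MNK"L'))          # [('term', 'MNK"L')]
print(split_query('"foo bar" baz'))  # [('phrase', 'foo bar'), ('term', 'baz')]
```

Under this reading a quote directly between two letters is never a delimiter, so input such as foo"bar also comes out as a single term.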

[jira] Commented: (LUCENE-2453) Make Index Output Buffer Size Configurable

2010-05-12 Thread Karthick Sankarachary (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866780#action_12866780 ] Karthick Sankarachary commented on LUCENE-2453: --- Hi Shai, To answer your co

[jira] Updated: (LUCENE-2453) Make Index Output Buffer Size Configurable

2010-05-12 Thread Karthick Sankarachary (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthick Sankarachary updated LUCENE-2453: -- Attachment: (was: LUCENE-2453.patch) > Make Index Output Buffer Size Confi

[jira] Updated: (LUCENE-2453) Make Index Output Buffer Size Configurable

2010-05-12 Thread Karthick Sankarachary (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthick Sankarachary updated LUCENE-2453: -- Attachment: LUCENE-2453.patch > Make Index Output Buffer Size Configurable > -

[jira] Updated: (SOLR-1908) Deduplication removes all docs

2010-05-12 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1908: --- Attachment: SOLR-1908.patch First pass at a test & fix -- allowed signatureField to be un-indexed (for people

[jira] Updated: (SOLR-1908) Deduplication removes all docs

2010-05-12 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1908: --- Attachment: SOLR-1908.patch after posting the last patch i remembered that URPFs could be SolrCoreAware, so h

[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr

2010-05-12 Thread Lance Norskog (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866926#action_12866926 ] Lance Norskog commented on SOLR-1163: - That works great. Another problem I've encounter

Re: [jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Robert Muir
Internationalization doesn't work by just piling hacks for language X, language Y, and language Z on top of each other. Just like I want the English hack removed, I strongly recommend against adding any Hebrew hack.
On Wed, May 12, 2010 at 6:55 PM, Itamar Syn-Hershko wrote:
> I think we understa

[jira] Updated: (SOLR-1801) SignatureUpdateProcessor should include all computed signatures in SolrQueryResponse

2010-05-12 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1801: --- Summary: SignatureUpdateProcessor should include all computed signatures in SolrQueryResponse (was: Delet

[jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866954#action_12866954 ] DM Smith commented on LUCENE-2458: -- As I see it there are two issues: 1) Backward compati

RE: [jira] Commented: (LUCENE-2458) queryparser shouldn't generate phrasequeries based on term count

2010-05-12 Thread Itamar Syn-Hershko
Again, this is not a hack, and that was exactly my point. As I said:
> resolving this is very simple, by just applying a correct logic (ignore double-quotes followed by a char) which isn't enforced today and once it will be, it won't cause any cases of unexpected behavior.
It is just valid