[jira] Commented: (SOLR-1826) highlighting breaks when using WordDelimiterFilter and setting termOffsets=true

2010-03-24 Thread Sanjoy Ghosh (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849071#action_12849071 ] Sanjoy Ghosh commented on SOLR-1826: Just uploaded a patch that should fix this bug.

[jira] Updated: (SOLR-1826) highlighting breaks when using WordDelimiterFilter and setting termOffsets=true

2010-03-24 Thread Sanjoy Ghosh (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjoy Ghosh updated SOLR-1826: Attachment: SOLR-1826.patch This is a fix that ensures that overlapping tokens are sorted correctly.
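
The patch itself is not reproduced in this listing, so the following is only a minimal sketch of what "sorting overlapping tokens correctly" could look like, assuming it means ordering by start offset and breaking ties by end offset. The `Tok` class and the `sortByOffsets` method are hypothetical stand-ins for illustration, not Lucene or Solr classes, and the sample offsets mimic what a word-delimiter filter might emit for "PowerShot".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class OverlappingTokenSort {

    /** Hypothetical token with character offsets; not a Lucene class. */
    static final class Tok {
        final String text;
        final int start;
        final int end;

        Tok(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Returns a new list ordered by start offset, then by end offset.
     * Assumption: this ordering is what a highlighter needs when sub-word
     * tokens overlap the original token; the real patch may differ.
     */
    static List<Tok> sortByOffsets(List<Tok> tokens) {
        List<Tok> sorted = new ArrayList<>(tokens);
        sorted.sort(Comparator.comparingInt((Tok t) -> t.start)
                              .thenComparingInt(t -> t.end));
        return sorted;
    }

    public static void main(String[] args) {
        // Overlapping tokens in an arbitrary order.
        List<Tok> tokens = Arrays.asList(
                new Tok("Shot", 5, 9),
                new Tok("PowerShot", 0, 9),
                new Tok("Power", 0, 5));
        for (Tok t : sortByOffsets(tokens)) {
            System.out.println(t.text + " [" + t.start + "," + t.end + ")");
        }
    }
}
```

The input list is copied before sorting so that callers holding the original token order are not affected.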

Any bugs I can help on?

2010-03-24 Thread Sanjoy Ghosh
Hi, I uploaded a patch for SOLR-1826. I am planning to look at SOLR-1556 next. I am starting to use Solr and would love to enhance my knowledge by fixing a few bugs. But when I looked through the open bugs on JIRA, I saw that a lot of them have already been worked on. A lot of them

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2010-03-24 Thread Thomas Heigl (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849091#action_12849091 ] Thomas Heigl commented on SOLR-799: Hello, For my current project I need to implement an

[jira] Commented: (SOLR-469) Data Import RequestHandler

2010-03-24 Thread Mis Tigi (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849093#action_12849093 ] Mis Tigi commented on SOLR-469: Thanks to everyone involved for this wonderful contribution.

Implementing near duplicate detection algorithm using IDF statistics

2010-03-24 Thread Thomas Heigl
Hello, For my current project I need to implement an index-time mechanism to detect (near) duplicate documents. The TextProfileSignature available out-of-the-box (http://wiki.apache.org/solr/Deduplication) seems alright but does not use global collection statistics in deciding which terms will

[jira] Commented: (SOLR-469) Data Import RequestHandler

2010-03-24 Thread Shalin Shekhar Mangar (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849121#action_12849121 ] Shalin Shekhar Mangar commented on SOLR-469: Thanks! Scheduling is not

[jira] Commented: (SOLR-1769) Solr 1.4 Replication - Repeater throwing NullPointerException

2010-03-24 Thread Jörgen Rydenius (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849127#action_12849127 ] Jörgen Rydenius commented on SOLR-1769: I also have problems with the Solr master

[jira] Updated: (SOLR-1834) Document level security

2010-03-24 Thread Anders Rask (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anders Rask updated SOLR-1834: Attachment: html.rar HTML page describing the component and how to use it Document level security

[jira] Commented: (SOLR-1834) Document level security

2010-03-24 Thread Anders Rask (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849141#action_12849141 ] Anders Rask commented on SOLR-1834: Thank you for looking at the patch. I'm aware that

[jira] Commented: (SOLR-1769) Solr 1.4 Replication - Repeater throwing NullPointerException

2010-03-24 Thread Jörgen Rydenius (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849142#action_12849142 ] Jörgen Rydenius commented on SOLR-1769: Line 922 of ReplicationHandler.java looks

[jira] Created: (SOLR-1841) Unregistering of Searcher MBean doesn't work in Websphere

2010-03-24 Thread Patrik Nordebo (JIRA)
Unregistering of Searcher MBean doesn't work in Websphere - Key: SOLR-1841 URL: https://issues.apache.org/jira/browse/SOLR-1841 Project: Solr Issue Type: Bug Environment:

[jira] Updated: (SOLR-1841) Unregistering of Searcher MBean doesn't work in Websphere

2010-03-24 Thread Patrik Nordebo (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrik Nordebo updated SOLR-1841: Attachment: patch Proposed fix: whenever the name returned from the register method differs from

[jira] Created: (SOLR-1842) DataImportHandler ODBC keeps lock on the source table while optimisatising is being run...

2010-03-24 Thread Marcin (JIRA)
DataImportHandler ODBC keeps lock on the source table while optimisatising is being run... -- Key: SOLR-1842 URL: https://issues.apache.org/jira/browse/SOLR-1842

[jira] Commented: (SOLR-1395) Integrate Katta

2010-03-24 Thread Sumit (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849155#action_12849155 ] Sumit commented on SOLR-1395: Can someone help me out with integrating Katta with Solr? I am

[jira] Updated: (SOLR-1843) JMX name collision when running multiple SOLR instances/webapps in the same ServletContainer

2010-03-24 Thread Constantijn Visinescu (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Constantijn Visinescu updated SOLR-1843: Attachment: SolrConfig.java JmxMonitoredMap.java Based on revision

[jira] Created: (SOLR-1843) JMX name collision when running multiple SOLR instances/webapps in the same ServletContainer

2010-03-24 Thread Constantijn Visinescu (JIRA)
JMX name collision when running multiple SOLR instances/webapps in the same ServletContainer Key: SOLR-1843 URL: https://issues.apache.org/jira/browse/SOLR-1843

[jira] Issue Comment Edited: (SOLR-1843) JMX name collision when running multiple SOLR instances/webapps in the same ServletContainer

2010-03-24 Thread Constantijn Visinescu (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849181#action_12849181 ] Constantijn Visinescu edited comment on SOLR-1843 at 3/24/10 1:43 PM:

[jira] Updated: (SOLR-1769) Solr 1.4 Replication - Repeater throwing NullPointerException

2010-03-24 Thread Jörgen Rydenius (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jörgen Rydenius updated SOLR-1769: Attachment: SOLR-1769-nullcheck.patch This simple null-check solved things for me. Solr 1.4

[jira] Commented: (SOLR-799) Add support for hash based exact/near duplicate document handling

2010-03-24 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849225#action_12849225 ] Andrzej Bialecki commented on SOLR-799: This issue is closed - please use the

[jira] Commented: (SOLR-1769) Solr 1.4 Replication - Repeater throwing NullPointerException

2010-03-24 Thread Noble Paul (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849232#action_12849232 ] Noble Paul commented on SOLR-1769: This can be checked in. I wonder what is the

[jira] Commented: (SOLR-469) Data Import RequestHandler

2010-03-24 Thread David Smiley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849299#action_12849299 ] David Smiley commented on SOLR-469: However doing so is a protocol crime -- HTTP GET verb

Re: Implementing near duplicate detection algorithm using IDF statistics

2010-03-24 Thread Ted Dunning
For reference, you can get a rental copy of this article for less than the cost of the full PDF download here: http://www.deepdyve.com/lp/association-for-computing-machinery/collection-statistics-for-fast-duplicate-document-detection-0o7i3Sx0Wd (joining the ACM is also a good thing to do) (and

[jira] Created: (SOLR-1844) CommonGramsQueryFilterFactory should read words in a comma-delimited format

2010-03-24 Thread David Smiley (JIRA)
CommonGramsQueryFilterFactory should read words in a comma-delimited format --- Key: SOLR-1844 URL: https://issues.apache.org/jira/browse/SOLR-1844 Project: Solr Issue

CloudSolrServer dependency on ZkController in the cloud branch

2010-03-24 Thread Igor Motov
I was experimenting with the latest (r919455) revision of the cloud branch and noticed a dependency between CloudSolrServer and ZkController. It looks like CloudSolrServer is still using two constants from ZkController: NODE_NAME and URL_PROP. After moving these two constants to ZkStateReader, I

Re: CloudSolrServer dependency on ZkController in the cloud branch

2010-03-24 Thread Mark Miller
On 03/24/2010 06:43 PM, Igor Motov wrote: I was experimenting with the latest (r919455) revision of the cloud branch and noticed a dependency between CloudSolrServer and ZkController. It looks like CloudSolrServer is still using two constants from ZkController: NODE_NAME and URL_PROP. After

[jira] Commented: (SOLR-1844) CommonGramsQueryFilterFactory should read words in a comma-delimited format

2010-03-24 Thread David Smiley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849571#action_12849571 ] David Smiley commented on SOLR-1844: It _does_ support comments; sorry.