[jira] [Updated] (SOLR-4018) Dispatcher to optionally write QTime and Hits HTTP header

2012-10-31 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-4018:


Description: 
SolrDispatchFilter should be able to write QTime and Hits HTTP headers via 
configuration.

{code}
[XML configuration example stripped by the mail archive]
{code}

  was:SolrDispatchFilter should be able to write QTime and Hits HTTP headers 
via configuration.
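The intended behaviour — the dispatcher optionally copying QTime and the hit count into HTTP response headers — can be sketched with plain JDK code. This is a hypothetical illustration, not the patch itself; the class name `HeaderSketch` and the header names `X-Solr-QTime`/`X-Solr-Hits` are assumptions:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the dispatcher-side logic: when enabled in the
// config, copy QTime and numFound from the Solr response into HTTP headers.
public class HeaderSketch {
    public static Map<String, String> buildHeaders(
            boolean writeQTime, boolean writeHits, long qTimeMs, long numFound) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (writeQTime) headers.put("X-Solr-QTime", Long.toString(qTimeMs));
        if (writeHits)  headers.put("X-Solr-Hits", Long.toString(numFound));
        return headers;
    }

    public static void main(String[] args) {
        System.out.println(buildHeaders(true, true, 12, 345));
    }
}
```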


> Dispatcher to optionally write QTime and Hits HTTP header
> -
>
> Key: SOLR-4018
> URL: https://issues.apache.org/jira/browse/SOLR-4018
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.0
>Reporter: Markus Jelsma
> Fix For: 4.1, 5.0
>
>
> SolrDispatchFilter should be able to write QTime and Hits HTTP headers via 
> configuration.
> {code}
> [XML configuration example stripped by the mail archive; only the fragment
> multipartUploadLimitInKB="2048000" survives]
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4018) Dispatcher to optionally write QTime and Hits HTTP header

2012-10-31 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-4018:


Attachment: SOLR-4018-trunk-1.patch

Here's a patch for trunk that adds two fields to SolrConfig, checks them, and 
adds the headers in SolrDispatchFilter.

> Dispatcher to optionally write QTime and Hits HTTP header
> -
>
> Key: SOLR-4018
> URL: https://issues.apache.org/jira/browse/SOLR-4018
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.0
>Reporter: Markus Jelsma
> Fix For: 4.1, 5.0
>
> Attachments: SOLR-4018-trunk-1.patch
>
>
> SolrDispatchFilter should be able to write QTime and Hits HTTP headers via 
> configuration.
> {code}
> [XML configuration example stripped by the mail archive; only the fragment
> multipartUploadLimitInKB="2048000" survives]
> {code}




[jira] [Created] (SOLR-4018) Dispatcher to optionally write QTime and Hits HTTP header

2012-10-31 Thread Markus Jelsma (JIRA)
Markus Jelsma created SOLR-4018:
---

 Summary: Dispatcher to optionally write QTime and Hits HTTP header
 Key: SOLR-4018
 URL: https://issues.apache.org/jira/browse/SOLR-4018
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.0
Reporter: Markus Jelsma
 Fix For: 4.1, 5.0


SolrDispatchFilter should be able to write QTime and Hits HTTP headers via 
configuration.




[jira] [Updated] (SOLR-3966) LangID not to log WARN

2012-10-23 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3966:


Attachment: SOLR-3966-trunk-1.patch

> LangID not to log WARN
> --
>
> Key: SOLR-3966
> URL: https://issues.apache.org/jira/browse/SOLR-3966
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.0
>Reporter: Markus Jelsma
> Fix For: 4.1, 5.0
>
> Attachments: SOLR-3966-trunk-1.patch
>
>
> The LangID UpdateProcessor emits the warning below for documents that do not 
> contain an input field. The level should go to DEBUG or be removed. It is not 
> uncommon to see a log full of these messages just because not all documents 
> contain all the fields we're mapping. 
> {code}Oct 19, 2012 11:23:43 AM 
> org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor process
> WARNING: Document  does not contain input field . Skipping 
> this{code} 




[jira] [Updated] (SOLR-3966) LangID not to log WARN

2012-10-23 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3966:


Attachment: (was: SOLR-3966-trunk-1.patch)

> LangID not to log WARN
> --
>
> Key: SOLR-3966
> URL: https://issues.apache.org/jira/browse/SOLR-3966
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.0
>Reporter: Markus Jelsma
> Fix For: 4.1, 5.0
>
> Attachments: SOLR-3966-trunk-1.patch
>
>
> The LangID UpdateProcessor emits the warning below for documents that do not 
> contain an input field. The level should go to DEBUG or be removed. It is not 
> uncommon to see a log full of these messages just because not all documents 
> contain all the fields we're mapping. 
> {code}Oct 19, 2012 11:23:43 AM 
> org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor process
> WARNING: Document  does not contain input field . Skipping 
> this{code} 




[jira] [Updated] (SOLR-3966) LangID not to log WARN

2012-10-23 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3966:


Attachment: SOLR-3966-trunk-1.patch

Patch removing the warning.

> LangID not to log WARN
> --
>
> Key: SOLR-3966
> URL: https://issues.apache.org/jira/browse/SOLR-3966
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.0
>Reporter: Markus Jelsma
> Fix For: 4.1, 5.0
>
> Attachments: SOLR-3966-trunk-1.patch
>
>
> The LangID UpdateProcessor emits the warning below for documents that do not 
> contain an input field. The level should go to DEBUG or be removed. It is not 
> uncommon to see a log full of these messages just because not all documents 
> contain all the fields we're mapping. 
> {code}Oct 19, 2012 11:23:43 AM 
> org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor process
> WARNING: Document  does not contain input field . Skipping 
> this{code} 




[jira] [Created] (SOLR-3966) LangID not to log WARN

2012-10-19 Thread Markus Jelsma (JIRA)
Markus Jelsma created SOLR-3966:
---

 Summary: LangID not to log WARN
 Key: SOLR-3966
 URL: https://issues.apache.org/jira/browse/SOLR-3966
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.0
Reporter: Markus Jelsma
 Fix For: 4.1, 5.0


The LangID UpdateProcessor emits the warning below for documents that do not 
contain an input field. The level should go to DEBUG or be removed. It is not 
uncommon to see a log full of these messages just because not all documents 
contain all the fields we're mapping. 

{code}Oct 19, 2012 11:23:43 AM 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor process
WARNING: Document  does not contain input field . Skipping 
this{code} 
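The proposed fix amounts to logging the missing-field case at debug level and skipping the document quietly. A minimal stdlib sketch of that idea using java.util.logging (the class and method names here are hypothetical, not from the patch):

```java
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch of the proposed fix: log the missing-field case at FINE (debug)
// instead of WARNING, so documents lacking a mapped field no longer flood
// the log.
public class LangIdLogSketch {
    private static final Logger log = Logger.getLogger(LangIdLogSketch.class.getName());

    // Returns false (skip language detection) when the input field is absent.
    public static boolean hasInputField(Map<String, Object> doc, String field) {
        if (!doc.containsKey(field)) {
            log.log(Level.FINE, "Document does not contain input field {0}. Skipping.", field);
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasInputField(Map.of("title", "hello"), "content"));
    }
}
```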




[jira] [Commented] (SOLR-3705) hl.alternateField does not support glob

2012-10-18 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478929#comment-13478929
 ] 

Markus Jelsma commented on SOLR-3705:
-

It took me a while to get back to this. The problem is that we have over 20 full-
text fields per document, of which only one contains the text; they're all 
language-specific fields like content_en, content_de, etc. We use 
hl.fl=content_* to get highlighted snippets for whatever field is matched by 
the main query. But a document can also be matched on a non-content field, so 
the highlighter won't find a snippet for the content field. We thought that if 
we could glob the alternate field as well, it would be a simple mechanism to 
get a snippet from an alternate field that is any of the content_* fields.
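The requested behaviour boils down to expanding a field glob such as content_* against the document's field names, as hl.fl already does. A stdlib sketch of that matching (class and method names are hypothetical):

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Sketch of glob expansion: turn a Solr-style field glob such as
// "content_*" into a regex and match it against the known field names.
public class FieldGlobSketch {
    public static List<String> expand(String glob, List<String> fields) {
        // Quote the literal parts and let each '*' match any run of characters.
        Pattern p = Pattern.compile(Pattern.quote(glob).replace("*", "\\E.*\\Q"));
        return fields.stream()
                     .filter(f -> p.matcher(f).matches())
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(expand("content_*", List.of("content_en", "content_de", "title")));
    }
}
```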

> hl.alternateField does not support glob
> ---
>
> Key: SOLR-3705
> URL: https://issues.apache.org/jira/browse/SOLR-3705
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Affects Versions: 4.0-ALPHA
>Reporter: Markus Jelsma
>Priority: Minor
> Fix For: 5.0
>
>
> Unlike hl.fl, the hl.alternateField does not support * to match field globs.




[jira] [Updated] (SOLR-3925) Expose SpanFirst in eDismax

2012-10-10 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3925:


Attachment: SOLR-3925-trunk-1.patch

> Expose SpanFirst in eDismax
> ---
>
> Key: SOLR-3925
> URL: https://issues.apache.org/jira/browse/SOLR-3925
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers
>Affects Versions: 4.0-BETA
> Environment: solr-spec 5.0.0.2012.10.09.19.29.59
> solr-impl 5.0-SNAPSHOT 1366361:1396116M - markus - 2012-10-09 19:29:59 
>Reporter: Markus Jelsma
> Fix For: 4.1, 5.0
>
> Attachments: SOLR-3925-trunk-1.patch
>
>
> Expose Lucene's SpanFirst capability in Solr's extended Dismax query parser. 
> This issue adds the SF-parameter (SpanFirst) and takes a FIELD~DISTANCE^BOOST 
> formatted value.
> For example, sf=title~5^2 will give a boost of 2 if one of the normal 
> clauses, originally generated for automatic phrase queries, is located within 
> five positions from the field's start.
> Unit test is included and all tests pass.




[jira] [Created] (SOLR-3925) Expose SpanFirst in eDismax

2012-10-10 Thread Markus Jelsma (JIRA)
Markus Jelsma created SOLR-3925:
---

 Summary: Expose SpanFirst in eDismax
 Key: SOLR-3925
 URL: https://issues.apache.org/jira/browse/SOLR-3925
 Project: Solr
  Issue Type: Improvement
  Components: query parsers
Affects Versions: 4.0-BETA
 Environment: solr-spec 5.0.0.2012.10.09.19.29.59
solr-impl 5.0-SNAPSHOT 1366361:1396116M - markus - 2012-10-09 19:29:59 
Reporter: Markus Jelsma
 Fix For: 4.1, 5.0
 Attachments: SOLR-3925-trunk-1.patch

Expose Lucene's SpanFirst capability in Solr's extended Dismax query parser. 
This issue adds the sf parameter (SpanFirst), which takes a value in 
FIELD~DISTANCE^BOOST format.

For example, sf=title~5^2 will give a boost of 2 if one of the normal clauses, 
originally generated for automatic phrase queries, is located within five 
positions from the field's start.

A unit test is included and all tests pass.
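The FIELD~DISTANCE^BOOST value can be parsed with plain string handling; a stdlib sketch of that parsing (the SfParamSketch class is hypothetical, not the patch's code):

```java
// Sketch of parsing the sf parameter's FIELD~DISTANCE^BOOST value,
// e.g. "title~5^2".
public class SfParamSketch {
    public final String field;
    public final int distance;   // max start position for the SpanFirst match
    public final float boost;    // defaults to 1.0 when no ^BOOST is given

    public SfParamSketch(String value) {
        int tilde = value.indexOf('~');
        int caret = value.indexOf('^');
        field = value.substring(0, tilde);
        distance = Integer.parseInt(
                value.substring(tilde + 1, caret < 0 ? value.length() : caret));
        boost = caret < 0 ? 1.0f : Float.parseFloat(value.substring(caret + 1));
    }

    public static void main(String[] args) {
        SfParamSketch p = new SfParamSketch("title~5^2");
        System.out.println(p.field + " " + p.distance + " " + p.boost);
    }
}
```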





[jira] [Closed] (LUCENE-4470) Expose SpanFirst in eDismax

2012-10-10 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma closed LUCENE-4470.
-

   Resolution: Invalid
Fix Version/s: (was: 5.0)
   (was: 4.1)

Accidentally added to Lucene. I'll close and open in the Solr project.
Sorry.

> Expose SpanFirst in eDismax
> ---
>
> Key: LUCENE-4470
> URL: https://issues.apache.org/jira/browse/LUCENE-4470
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/queryparser
>Affects Versions: 4.0-BETA
> Environment: solr-spec 5.0.0.2012.10.09.19.29.59
> solr-impl 5.0-SNAPSHOT 1366361:1396116M - markus - 2012-10-09 19:29:59 
>Reporter: Markus Jelsma
> Attachments: SOLR-4470-trunk-1.patch
>
>
> Expose Lucene's SpanFirst capability in Solr's extended Dismax query parser. 
> This issue adds the SF-parameter (SpanFirst) and takes a FIELD~DISTANCE^BOOST 
> formatted value.
> For example, sf=title~5^2 will give a boost of 2 if one of the normal 
> clauses, originally generated for automatic phrase queries, is located within 
> five positions from the field's start.
> Unit test is included and all tests pass.




[jira] [Updated] (LUCENE-4470) Expose SpanFirst in eDismax

2012-10-10 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated LUCENE-4470:
--

Attachment: SOLR-4470-trunk-1.patch

> Expose SpanFirst in eDismax
> ---
>
> Key: LUCENE-4470
> URL: https://issues.apache.org/jira/browse/LUCENE-4470
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/queryparser
>Affects Versions: 4.0-BETA
> Environment: solr-spec 5.0.0.2012.10.09.19.29.59
> solr-impl 5.0-SNAPSHOT 1366361:1396116M - markus - 2012-10-09 19:29:59 
>Reporter: Markus Jelsma
> Fix For: 4.1, 5.0
>
> Attachments: SOLR-4470-trunk-1.patch
>
>
> Expose Lucene's SpanFirst capability in Solr's extended Dismax query parser. 
> This issue adds the SF-parameter (SpanFirst) and takes a FIELD~DISTANCE^BOOST 
> formatted value.
> For example, sf=title~5^2 will give a boost of 2 if one of the normal 
> clauses, originally generated for automatic phrase queries, is located within 
> five positions from the field's start.
> Unit test is included and all tests pass.




[jira] [Created] (LUCENE-4470) Expose SpanFirst in eDismax

2012-10-10 Thread Markus Jelsma (JIRA)
Markus Jelsma created LUCENE-4470:
-

 Summary: Expose SpanFirst in eDismax
 Key: LUCENE-4470
 URL: https://issues.apache.org/jira/browse/LUCENE-4470
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/queryparser
Affects Versions: 4.0-BETA
 Environment: solr-spec 5.0.0.2012.10.09.19.29.59
solr-impl 5.0-SNAPSHOT 1366361:1396116M - markus - 2012-10-09 19:29:59 
Reporter: Markus Jelsma
 Fix For: 4.1, 5.0


Expose Lucene's SpanFirst capability in Solr's extended Dismax query parser. 
This issue adds the SF-parameter (SpanFirst) and takes a FIELD~DISTANCE^BOOST 
formatted value.

For example, sf=title~5^2 will give a boost of 2 if one of the normal clauses, 
originally generated for automatic phrase queries, is located within five 
positions from the field's start.

Unit test is included and all tests pass.




[jira] [Commented] (SOLR-3685) Solr Cloud sometimes skipped peersync attempt and replicated instead due to tlog flags not being cleared when no updates were buffered during a previous replication.

2012-09-19 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13458679#comment-13458679
 ] 

Markus Jelsma commented on SOLR-3685:
-

{quote}So what we're seeing here is the mmapped nodes use more RES and SHR than 
the NIO node. VIRT is as expected. I'll change another node to NIO and keep 
them running again for the next few days and keep sending documents and firing 
queries.{quote}

There is still an issue with mmap and high RES as opposed to NIOFS, but the 
actual issue here is already resolved. I'll open a new issue.

> Solr Cloud sometimes skipped peersync attempt and replicated instead due to 
> tlog flags not being cleared when no updates were buffered during a previous 
> replication.
> -
>
> Key: SOLR-3685
> URL: https://issues.apache.org/jira/browse/SOLR-3685
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java), SolrCloud
>Affects Versions: 4.0-ALPHA
> Environment: Debian GNU/Linux Squeeze 64bit
> Solr 5.0-SNAPSHOT 1365667M - markus - 2012-07-25 19:09:43
>Reporter: Markus Jelsma
>Assignee: Yonik Seeley
>Priority: Critical
> Fix For: 4.0, 5.0
>
> Attachments: info.log, oom-killer.log, pmap.log
>
>
> There's a serious problem with restarting nodes, not cleaning old or unused 
> index directories, and sudden replication and Java being killed by the OS due 
> to excessive memory allocation. Since SOLR-1781 was fixed, index directories 
> get cleaned up when a node is restarted cleanly; however, old or unused 
> index directories still pile up if Solr crashes or is killed by the OS, 
> which is what is happening here.
> We have a six-node 64-bit Linux test cluster with each node having two 
> shards. There's 512MB RAM available and no swap. Each index is roughly 27MB, 
> so about 50MB per node; this fits easily and works fine. However, if a node 
> is restarted, Solr will consistently crash because it immediately eats 
> up all RAM. If swap is enabled, Solr will eat an additional few hundred MB 
> right after startup.
> This cannot be solved by restarting Solr; it will just crash again and leave 
> index directories in place until the disk is full. The only way I can restart 
> a node safely is to delete the index directories and have it replicate from 
> another node. If I then restart the node it will crash almost consistently.
> I'll attach a log of one of the nodes.




[jira] [Commented] (SOLR-3808) Extraction contrib to utilize Boilerpipe

2012-09-14 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455680#comment-13455680
 ] 

Markus Jelsma commented on SOLR-3808:
-

Hi - in Apache Nutch I keep the loaded extractors in a static HashMap. The 
content handlers have to be wrapped like this, and the extractor implementation 
has to be passed to the BoilerpipeContentHandler constructor; it doesn't use 
configuration to find an extractor.
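Tika's BoilerpipeContentHandler follows the standard SAX decorator pattern: it wraps a downstream ContentHandler and intercepts the character events. A stdlib-only sketch of that wrapping idea using org.xml.sax; the CountingHandler below is a hypothetical stand-in that merely counts characters, whereas the real handler buffers the text and runs the Boilerpipe extractor passed to its constructor:

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Sketch of the decorator pattern BoilerpipeContentHandler follows: wrap a
// downstream ContentHandler and intercept character events before
// forwarding them.
public class CountingHandler extends DefaultHandler {
    private final ContentHandler delegate;
    private int chars;

    public CountingHandler(ContentHandler delegate) {
        this.delegate = delegate;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        chars += length;
        try {
            delegate.characters(ch, start, length);
        } catch (SAXException e) {
            throw new RuntimeException(e);
        }
    }

    public int charCount() {
        return chars;
    }

    public static void main(String[] args) {
        CountingHandler h = new CountingHandler(new DefaultHandler());
        h.characters("boilerplate".toCharArray(), 0, 11);
        System.out.println(h.charCount());
    }
}
```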

> Extraction contrib to utilize Boilerpipe
> 
>
> Key: SOLR-3808
> URL: https://issues.apache.org/jira/browse/SOLR-3808
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - Solr Cell (Tika extraction)
>Reporter: Markus Jelsma
>Priority: Minor
> Attachments: SOLR-3808-trunk-1.patch
>
>
> Solr's extraction contrib uses Tika for document parsing and should be able 
> to use Boilerpipe. Tika comes with Boilerpipe, a library capable of removing 
> boilerplate text from HTML pages.




[jira] [Updated] (SOLR-3808) Extraction contrib to utilize Boilerpipe

2012-09-07 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3808:


Attachment: SOLR-3808-trunk-1.patch

Here's a patch for trunk. It introduces the boilerpipe parameter, which takes a 
named Boilerpipe extractor such as ArticleExtractor, CanolaExtractor or 
KeepEverythingExtractor.

It also comes with a unit test. The test HTML file's extracted content will not 
contain the word `footer` if the ArticleExtractor is used.

> Extraction contrib to utilize Boilerpipe
> 
>
> Key: SOLR-3808
> URL: https://issues.apache.org/jira/browse/SOLR-3808
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - Solr Cell (Tika extraction)
>Reporter: Markus Jelsma
>Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-3808-trunk-1.patch
>
>
> Solr's extraction contrib uses Tika for document parsing and should be able 
> to use Boilerpipe. Tika comes with Boilerpipe, a library capable of removing 
> boilerplate text from HTML pages.




[jira] [Created] (SOLR-3808) Extraction contrib to utilize Boilerpipe

2012-09-07 Thread Markus Jelsma (JIRA)
Markus Jelsma created SOLR-3808:
---

 Summary: Extraction contrib to utilize Boilerpipe
 Key: SOLR-3808
 URL: https://issues.apache.org/jira/browse/SOLR-3808
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Markus Jelsma
Priority: Minor
 Fix For: 4.0


Solr's extraction contrib uses Tika for document parsing and should be able to 
use Boilerpipe. Tika comes with Boilerpipe, a library capable of removing 
boilerplate text from HTML pages.




[jira] [Commented] (SOLR-3782) A leader going down while updates are coming in can cause shard inconsistency.

2012-09-04 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447644#comment-13447644
 ] 

Markus Jelsma commented on SOLR-3782:
-

Hi - this is in CHANGES, but is it resolved?

> A leader going down while updates are coming in can cause shard inconsistency.
> --
>
> Key: SOLR-3782
> URL: https://issues.apache.org/jira/browse/SOLR-3782
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Mark Miller
>Assignee: Mark Miller
> Fix For: 4.0, 5.0
>
>
> Harpoon into the head of the great whale I have been chasing for a couple 
> weeks now.
> ChaosMonkey test was exposing this.
> Turns out the problem was the solr cmd distrib executor - when closing the 
> leader CoreContainer, we would close the zkController while updates can still 
> flow through the distrib executor. The result was that we would send updates 
> from the leader briefly even though there was a new leader.
> I had suspected something similar to this at one point in the hunt and 
> started adding some defensive state checks that we wanted to add anyway. I 
> don't think they caught all of this issue due to the limited tightness one of 
> the state checks can get to (checking the cloudstate leader from a replica 
> against the leader indicated by the request).
> So the answer is to finally work out how to stop the solr cmd distrib 
> executor - because we need to stop it before closing zkController and giving 
> up our role as leader.
> I've worked that all out and the issue no longer seems to be a problem.




[jira] [Commented] (SOLR-3685) Solr Cloud sometimes skipped peersync attempt and replicated instead due to tlog flags not being cleared when no updates were buffered during a previous replication.

2012-08-20 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437790#comment-13437790
 ] 

Markus Jelsma commented on SOLR-3685:
-

To my surprise, the RES for all nodes except the NIOFS node increased slowly 
over the past three days and was still increasing today. The mmapped nodes 
sometimes used up to three times the Xmx and, for some reason, about half the 
Xmx in shared memory. We just restarted all nodes, with nine using mmap and one 
using NIO; after restart the mmapped nodes immediately start to use a lot more 
RES than the NIO node. The NIO node also uses much less shared memory.

Perhaps what I've seen before, with NIO also crashing, was due to some other 
issue.

So what we're seeing here is that the mmapped nodes use more RES and SHR than 
the NIO node. VIRT is as expected. I'll change another node to NIO, keep them 
running again for the next few days, and keep sending documents and firing 
queries.

All nodes are running the August 20th trunk from now on.



> Solr Cloud sometimes skipped peersync attempt and replicated instead due to 
> tlog flags not being cleared when no updates were buffered during a previous 
> replication.
> -
>
> Key: SOLR-3685
> URL: https://issues.apache.org/jira/browse/SOLR-3685
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java), SolrCloud
>Affects Versions: 4.0-ALPHA
> Environment: Debian GNU/Linux Squeeze 64bit
> Solr 5.0-SNAPSHOT 1365667M - markus - 2012-07-25 19:09:43
>Reporter: Markus Jelsma
>Assignee: Yonik Seeley
>Priority: Critical
> Fix For: 4.0, 5.0
>
> Attachments: info.log, oom-killer.log, pmap.log
>
>
> There's a serious problem with restarting nodes, not cleaning old or unused 
> index directories, and sudden replication and Java being killed by the OS due 
> to excessive memory allocation. Since SOLR-1781 was fixed, index directories 
> get cleaned up when a node is restarted cleanly; however, old or unused 
> index directories still pile up if Solr crashes or is killed by the OS, 
> which is what is happening here.
> We have a six-node 64-bit Linux test cluster with each node having two 
> shards. There's 512MB RAM available and no swap. Each index is roughly 27MB, 
> so about 50MB per node; this fits easily and works fine. However, if a node 
> is restarted, Solr will consistently crash because it immediately eats 
> up all RAM. If swap is enabled, Solr will eat an additional few hundred MB 
> right after startup.
> This cannot be solved by restarting Solr; it will just crash again and leave 
> index directories in place until the disk is full. The only way I can restart 
> a node safely is to delete the index directories and have it replicate from 
> another node. If I then restart the node it will crash almost consistently.
> I'll attach a log of one of the nodes.




[jira] [Updated] (SOLR-3685) Solr Cloud sometimes skipped peersync attempt and replicated instead due to tlog flags not being cleared when no updates were buffered during a previous replication.

2012-08-17 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3685:


Attachment: pmap.log

Here's the pmap for one node. Heap Xmx is still 256M. I also just noticed this 
node still has OpenJDK 6 running instead of Sun Java 6 like the other nodes; 
despite that difference, the memory consumption is equal.
I'll also restart a node with NIOFS, but I still expect memory to increase as 
with mmap.
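Switching a node between mmap and NIO comes down to the directoryFactory setting in solrconfig.xml, roughly as below (factory class names as in Solr 4.x; verify against your version):

```xml
<!-- In solrconfig.xml: pick the Directory implementation per node.
     NIOFSDirectoryFactory avoids mmap; MMapDirectoryFactory uses it. -->
<directoryFactory name="DirectoryFactory" class="solr.NIOFSDirectoryFactory"/>
```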

> Solr Cloud sometimes skipped peersync attempt and replicated instead due to 
> tlog flags not being cleared when no updates were buffered during a previous 
> replication.
> -
>
> Key: SOLR-3685
> URL: https://issues.apache.org/jira/browse/SOLR-3685
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java), SolrCloud
>Affects Versions: 4.0-ALPHA
> Environment: Debian GNU/Linux Squeeze 64bit
> Solr 5.0-SNAPSHOT 1365667M - markus - 2012-07-25 19:09:43
>Reporter: Markus Jelsma
>Assignee: Yonik Seeley
>Priority: Critical
> Fix For: 4.0, 5.0
>
> Attachments: info.log, oom-killer.log, pmap.log
>
>
> There's a serious problem with restarting nodes, not cleaning old or unused 
> index directories, and sudden replication and Java being killed by the OS due 
> to excessive memory allocation. Since SOLR-1781 was fixed, index directories 
> get cleaned up when a node is restarted cleanly; however, old or unused 
> index directories still pile up if Solr crashes or is killed by the OS, 
> which is what is happening here.
> We have a six-node 64-bit Linux test cluster with each node having two 
> shards. There's 512MB RAM available and no swap. Each index is roughly 27MB, 
> so about 50MB per node; this fits easily and works fine. However, if a node 
> is restarted, Solr will consistently crash because it immediately eats 
> up all RAM. If swap is enabled, Solr will eat an additional few hundred MB 
> right after startup.
> This cannot be solved by restarting Solr; it will just crash again and leave 
> index directories in place until the disk is full. The only way I can restart 
> a node safely is to delete the index directories and have it replicate from 
> another node. If I then restart the node it will crash almost consistently.
> I'll attach a log of one of the nodes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-3685) Solr Cloud sometimes skipped peersync attempt and replicated instead due to tlog flags not being cleared when no updates were buffered during a previous replication

2012-08-16 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436214#comment-13436214
 ] 

Markus Jelsma edited comment on SOLR-3685 at 8/17/12 5:48 AM:
--

We didn't think mmap could be the cause, but we nevertheless tried it once on 
a smaller cluster and again saw a lot of memory consumption, after which the 
process was killed.
I can see if I can run one or two of the nodes with NIOFS while letting the 
others run with mmap. We don't automatically restart cores, so it should run 
fine if we temporarily change the config in ZooKeeper and restart two nodes.

edit: each core has a ~2.5GB index.

  was (Author: markus17):
We didn't think mmap could be the cause but nevertheless we tried that once 
on a smaller cluster and got a lot of memory consumption again, after which it 
got killed.
I can see if i can run one or two of the nodes with NIOFS but let the other run 
with mmap. We don't automatically restart cores so it should run fine if we 
temporarily change the config in zookeeper and restart two nodes.
  
> Solr Cloud sometimes skipped peersync attempt and replicated instead due to 
> tlog flags not being cleared when no updates were buffered during a previous 
> replication.
> -
>
> Key: SOLR-3685
> URL: https://issues.apache.org/jira/browse/SOLR-3685
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java), SolrCloud
>Affects Versions: 4.0-ALPHA
> Environment: Debian GNU/Linux Squeeze 64bit
> Solr 5.0-SNAPSHOT 1365667M - markus - 2012-07-25 19:09:43
>Reporter: Markus Jelsma
>Assignee: Yonik Seeley
>Priority: Critical
> Fix For: 4.0, 5.0
>
> Attachments: info.log, oom-killer.log
>
>
> There's a serious problem with restarting nodes, not cleaning old or unused 
> index directories and sudden replication and Java being killed by the OS due 
> to excessive memory allocation. Since SOLR-1781 was fixed index directories 
> get cleaned up when a node is being restarted cleanly, however, old or unused 
> index directories still pile up if Solr crashes or is being killed by the OS, 
> happening here.
> We have a six-node 64-bit Linux test cluster with each node having two 
> shards. There's 512MB RAM available and no swap. Each index is roughly 27MB 
> so about 50MB per node, this fits easily and works fine. However, if a node 
> is being restarted, Solr will consistently crash because it immediately eats 
> up all RAM. If swap is enabled Solr will eat an additional few 100MB's right 
> after start up.
> This cannot be solved by restarting Solr, it will just crash again and leave 
> index directories in place until the disk is full. The only way i can restart 
> a node safely is to delete the index directories and have it replicate from 
> another node. If i then restart the node it will crash almost consistently.
> I'll attach a log of one of the nodes.




[jira] [Commented] (SOLR-3685) Solr Cloud sometimes skipped peersync attempt and replicated instead due to tlog flags not being cleared when no updates were buffered during a previous replication.

2012-08-16 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436214#comment-13436214
 ] 

Markus Jelsma commented on SOLR-3685:
-

We didn't think mmap could be the cause, but we nevertheless tried it once on 
a smaller cluster and again saw a lot of memory consumption, after which the 
process was killed.
I can see if I can run one or two of the nodes with NIOFS while letting the 
others run with mmap. We don't automatically restart cores, so it should run 
fine if we temporarily change the config in ZooKeeper and restart two nodes.
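A sketch of how individual nodes could be switched between the two directory factories, assuming solrconfig.xml references the factory through a substitutable system property as the stock Solr 4.x example config does (an assumption about this cluster's config):

```shell
# Assumed solrconfig.xml line (stock Solr 4.x example config):
#   <directoryFactory class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
# With that in place, the factory can be swapped per node at JVM startup
# without editing the shared config in ZooKeeper:
SOLR_OPTS="-Dsolr.directoryFactory=solr.NIOFSDirectoryFactory"
echo "Would start the servlet container with: $SOLR_OPTS"
```

That keeps the mmap nodes untouched while the NIOFS nodes only need a restart with the extra JVM flag.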

> Solr Cloud sometimes skipped peersync attempt and replicated instead due to 
> tlog flags not being cleared when no updates were buffered during a previous 
> replication.
> -
>
> Key: SOLR-3685
> URL: https://issues.apache.org/jira/browse/SOLR-3685
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java), SolrCloud
>Affects Versions: 4.0-ALPHA
> Environment: Debian GNU/Linux Squeeze 64bit
> Solr 5.0-SNAPSHOT 1365667M - markus - 2012-07-25 19:09:43
>Reporter: Markus Jelsma
>Assignee: Yonik Seeley
>Priority: Critical
> Fix For: 4.0, 5.0
>
> Attachments: info.log, oom-killer.log
>
>
> There's a serious problem with restarting nodes, not cleaning old or unused 
> index directories and sudden replication and Java being killed by the OS due 
> to excessive memory allocation. Since SOLR-1781 was fixed index directories 
> get cleaned up when a node is being restarted cleanly, however, old or unused 
> index directories still pile up if Solr crashes or is being killed by the OS, 
> happening here.
> We have a six-node 64-bit Linux test cluster with each node having two 
> shards. There's 512MB RAM available and no swap. Each index is roughly 27MB 
> so about 50MB per node, this fits easily and works fine. However, if a node 
> is being restarted, Solr will consistently crash because it immediately eats 
> up all RAM. If swap is enabled Solr will eat an additional few 100MB's right 
> after start up.
> This cannot be solved by restarting Solr, it will just crash again and leave 
> index directories in place until the disk is full. The only way i can restart 
> a node safely is to delete the index directories and have it replicate from 
> another node. If i then restart the node it will crash almost consistently.
> I'll attach a log of one of the nodes.




[jira] [Updated] (SOLR-3685) Solr Cloud sometimes skipped peersync attempt and replicated instead due to tlog flags not being cleared when no updates were buffered during a previous replication.

2012-08-16 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3685:


Attachment: oom-killer.log

Here's the relevant part of the syslog for a node where Tomcat is killed by the 
OS. There is 1GB of available RAM, no configured swap, and the heap size is 
256MB. The node has two running cores.

The off-heap RES memory for the Java process sometimes gets so large that Linux 
decides to kill it.
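A sketch of how the relevant syslog lines can be pulled out. The log path and message wording are assumptions (Debian's kernel-log format); it's demonstrated here on a fabricated sample line so the pipeline is self-contained:

```shell
# Fabricated sample line in the Debian/kernel OOM-killer wording (an
# assumption; check your distro's syslog for the exact format and path):
LOG=$(mktemp)
printf '%s\n' \
  'Aug 16 05:12:01 nl2 kernel: [1234.567] Out of memory: Kill process 4321 (java) score 905 or sacrifice child' \
  > "$LOG"

# Extract OOM-killer events; on a real node, point this at /var/log/kern.log:
grep -E 'Out of memory|oom-killer' "$LOG"
rm -f "$LOG"
```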

> Solr Cloud sometimes skipped peersync attempt and replicated instead due to 
> tlog flags not being cleared when no updates were buffered during a previous 
> replication.
> -
>
> Key: SOLR-3685
> URL: https://issues.apache.org/jira/browse/SOLR-3685
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java), SolrCloud
>Affects Versions: 4.0-ALPHA
> Environment: Debian GNU/Linux Squeeze 64bit
> Solr 5.0-SNAPSHOT 1365667M - markus - 2012-07-25 19:09:43
>Reporter: Markus Jelsma
>Assignee: Yonik Seeley
>Priority: Critical
> Fix For: 4.0, 5.0
>
> Attachments: info.log, oom-killer.log
>
>
> There's a serious problem with restarting nodes, not cleaning old or unused 
> index directories and sudden replication and Java being killed by the OS due 
> to excessive memory allocation. Since SOLR-1781 was fixed index directories 
> get cleaned up when a node is being restarted cleanly, however, old or unused 
> index directories still pile up if Solr crashes or is being killed by the OS, 
> happening here.
> We have a six-node 64-bit Linux test cluster with each node having two 
> shards. There's 512MB RAM available and no swap. Each index is roughly 27MB 
> so about 50MB per node, this fits easily and works fine. However, if a node 
> is being restarted, Solr will consistently crash because it immediately eats 
> up all RAM. If swap is enabled Solr will eat an additional few 100MB's right 
> after start up.
> This cannot be solved by restarting Solr, it will just crash again and leave 
> index directories in place until the disk is full. The only way i can restart 
> a node safely is to delete the index directories and have it replicate from 
> another node. If i then restart the node it will crash almost consistently.
> I'll attach a log of one of the nodes.




[jira] [Commented] (SOLR-3685) Solr Cloud sometimes skipped peersync attempt and replicated instead due to tlog flags not being cleared when no updates were buffered during a previous replication.

2012-08-16 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436194#comment-13436194
 ] 

Markus Jelsma commented on SOLR-3685:
-

One node also had rsyslogd killed, but the other node survived. I assume the 
Linux OOM-killer output is what you're referring to?

> Solr Cloud sometimes skipped peersync attempt and replicated instead due to 
> tlog flags not being cleared when no updates were buffered during a previous 
> replication.
> -
>
> Key: SOLR-3685
> URL: https://issues.apache.org/jira/browse/SOLR-3685
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java), SolrCloud
>Affects Versions: 4.0-ALPHA
> Environment: Debian GNU/Linux Squeeze 64bit
> Solr 5.0-SNAPSHOT 1365667M - markus - 2012-07-25 19:09:43
>Reporter: Markus Jelsma
>Assignee: Yonik Seeley
>Priority: Critical
> Fix For: 4.0, 5.0
>
> Attachments: info.log
>
>
> There's a serious problem with restarting nodes, not cleaning old or unused 
> index directories and sudden replication and Java being killed by the OS due 
> to excessive memory allocation. Since SOLR-1781 was fixed index directories 
> get cleaned up when a node is being restarted cleanly, however, old or unused 
> index directories still pile up if Solr crashes or is being killed by the OS, 
> happening here.
> We have a six-node 64-bit Linux test cluster with each node having two 
> shards. There's 512MB RAM available and no swap. Each index is roughly 27MB 
> so about 50MB per node, this fits easily and works fine. However, if a node 
> is being restarted, Solr will consistently crash because it immediately eats 
> up all RAM. If swap is enabled Solr will eat an additional few 100MB's right 
> after start up.
> This cannot be solved by restarting Solr, it will just crash again and leave 
> index directories in place until the disk is full. The only way i can restart 
> a node safely is to delete the index directories and have it replicate from 
> another node. If i then restart the node it will crash almost consistently.
> I'll attach a log of one of the nodes.




[jira] [Commented] (SOLR-3685) Solr Cloud sometimes skipped peersync attempt and replicated instead due to tlog flags not being cleared when no updates were buffered during a previous replication.

2012-08-16 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436182#comment-13436182
 ] 

Markus Jelsma commented on SOLR-3685:
-

Finally! Two nodes failed again and got killed by the OS. All nodes have a lot 
of off-heap RES memory, sometimes 3x the heap, which is a meager 256MB.

Do you have a name suggestion for the memory issue? I'll open a new issue 
tomorrow and link it to this one.

> Solr Cloud sometimes skipped peersync attempt and replicated instead due to 
> tlog flags not being cleared when no updates were buffered during a previous 
> replication.
> -
>
> Key: SOLR-3685
> URL: https://issues.apache.org/jira/browse/SOLR-3685
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java), SolrCloud
>Affects Versions: 4.0-ALPHA
> Environment: Debian GNU/Linux Squeeze 64bit
> Solr 5.0-SNAPSHOT 1365667M - markus - 2012-07-25 19:09:43
>Reporter: Markus Jelsma
>Assignee: Yonik Seeley
>Priority: Critical
> Fix For: 4.0, 5.0
>
> Attachments: info.log
>
>
> There's a serious problem with restarting nodes, not cleaning old or unused 
> index directories and sudden replication and Java being killed by the OS due 
> to excessive memory allocation. Since SOLR-1781 was fixed index directories 
> get cleaned up when a node is being restarted cleanly, however, old or unused 
> index directories still pile up if Solr crashes or is being killed by the OS, 
> happening here.
> We have a six-node 64-bit Linux test cluster with each node having two 
> shards. There's 512MB RAM available and no swap. Each index is roughly 27MB 
> so about 50MB per node, this fits easily and works fine. However, if a node 
> is being restarted, Solr will consistently crash because it immediately eats 
> up all RAM. If swap is enabled Solr will eat an additional few 100MB's right 
> after start up.
> This cannot be solved by restarting Solr, it will just crash again and leave 
> index directories in place until the disk is full. The only way i can restart 
> a node safely is to delete the index directories and have it replicate from 
> another node. If i then restart the node it will crash almost consistently.
> I'll attach a log of one of the nodes.




[jira] [Commented] (SOLR-2789) Add showItems to LRUCache, display extra details about cached data for both LRUCache and FastLRUCache

2012-08-14 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434079#comment-13434079
 ] 

Markus Jelsma commented on SOLR-2789:
-

This issue is not resolved, but it's documented as if it were:
http://wiki.apache.org/solr/SolrCaching#showItems

> Add showItems to LRUCache, display extra details about cached data for both 
> LRUCache and FastLRUCache
> -
>
> Key: SOLR-2789
> URL: https://issues.apache.org/jira/browse/SOLR-2789
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 3.4
>Reporter: Eric Pugh
>Priority: Minor
> Attachments: cache_details.patch
>
>
> I noticed that the showItems parameter for cache configuration applies to 
> FastLRUCache but not LRUCache.  So I added it.  I also noticed that the key 
> data about an item that is QueryResult cache hit wasn't very useful, but was 
> for document and filter cache, so added specific support for QueryResultKey.
> This patch should probably be merged with the work in SOLR-1893, which is a 
> proposed refactor.  I'm happy to merge these two into a single patch if 
> someone things that is a good idea.
> The output for a queryresult cache hit now looks like:
> 0.66
> 43
> 0
>  sort=null)">org.apache.solr.search.DocSlice@1e66a917
> org.apache.solr.search.DocSlice@7bfcb845
> I've also updated the example solrconfig.xml in order to better show off the 
> showItems parameter.




[jira] [Commented] (SOLR-1238) exception in solrJ when authentication is used

2012-08-07 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13430701#comment-13430701
 ] 

Markus Jelsma commented on SOLR-1238:
-

We've seen this issue over the past few years with SolrJ, with or without 
authentication. Perhaps this issue could be renamed and marked as affecting 
current Solr versions, if applicable.

I can't remember seeing this exception when using curl to load data.

> exception in solrJ when authentication is used
> --
>
> Key: SOLR-1238
> URL: https://issues.apache.org/jira/browse/SOLR-1238
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Affects Versions: 1.3
>Reporter: Noble Paul
>Priority: Minor
> Attachments: SOLR-1238.patch
>
>
> see the thread http://markmail.org/thread/w36ih2fnphbubian
> {code}
> I am facing an error when I am using authentication in Solr. I followed the 
> wiki. The error does not appear when I am searching. Below are the code 
> snippet and the error.
> Please note I am using a Solr 1.4 development build from SVN.
>     HttpClient client = new HttpClient();
>     AuthScope scope = new AuthScope(AuthScope.ANY_HOST, AuthScope.ANY_PORT, null, null);
>     client.getState().setCredentials(scope, new UsernamePasswordCredentials("guest", "guest"));
>     SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr", client);
>     SolrInputDocument doc1 = new SolrInputDocument();
>     // Add fields to the document
>     doc1.addField("employeeid", "1237");
>     doc1.addField("employeename", "Ann");
>     doc1.addField("employeeunit", "etc");
>     doc1.addField("employeedoj", "1995-11-31T23:59:59Z");
>     server.add(doc1);
> Exception in thread "main" org.apache.solr.client.solrj.SolrServerException: org.apache.commons.httpclient.ProtocolException: Unbuffered entity enclosing request can not be repeated.
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:468)
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:242)
>     at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:259)
>     at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:63)
>     at test.SolrAuthenticationTest.(SolrAuthenticationTest.java:49)
>     at test.SolrAuthenticationTest.main(SolrAuthenticationTest.java:113)
> Caused by: org.apache.commons.httpclient.ProtocolException: Unbuffered entity enclosing request can not be repeated.
>     at org.apache.commons.httpclient.methods.EntityEnclosingMethod.writeRequestBody(EntityEnclosingMethod.java:487)
>     at org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2114)
>     at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1096)
>     at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
>     at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
>     at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
>     at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
>     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:415)
>     ... 5 more.
> {code}




[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption

2012-08-07 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13430438#comment-13430438
 ] 

Markus Jelsma commented on SOLR-3685:
-

When exactly? Do you have an issue?

> solrcloud crashes on startup due to excessive memory consumption
> 
>
> Key: SOLR-3685
> URL: https://issues.apache.org/jira/browse/SOLR-3685
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java), SolrCloud
>Affects Versions: 4.0-ALPHA
> Environment: Debian GNU/Linux Squeeze 64bit
> Solr 5.0-SNAPSHOT 1365667M - markus - 2012-07-25 19:09:43
>Reporter: Markus Jelsma
>Priority: Critical
> Fix For: 4.1
>
> Attachments: info.log
>
>
> There's a serious problem with restarting nodes, not cleaning old or unused 
> index directories and sudden replication and Java being killed by the OS due 
> to excessive memory allocation. Since SOLR-1781 was fixed index directories 
> get cleaned up when a node is being restarted cleanly, however, old or unused 
> index directories still pile up if Solr crashes or is being killed by the OS, 
> happening here.
> We have a six-node 64-bit Linux test cluster with each node having two 
> shards. There's 512MB RAM available and no swap. Each index is roughly 27MB 
> so about 50MB per node, this fits easily and works fine. However, if a node 
> is being restarted, Solr will consistently crash because it immediately eats 
> up all RAM. If swap is enabled Solr will eat an additional few 100MB's right 
> after start up.
> This cannot be solved by restarting Solr, it will just crash again and leave 
> index directories in place until the disk is full. The only way i can restart 
> a node safely is to delete the index directories and have it replicate from 
> another node. If i then restart the node it will crash almost consistently.
> I'll attach a log of one of the nodes.




[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption

2012-08-06 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429132#comment-13429132
 ] 

Markus Jelsma commented on SOLR-3685:
-

Each node has two cores and allows only one warming searcher at any time. The 
problem is triggered on start-up after a graceful shutdown as well as after a 
hard power-off. I've seen it happen not only when the whole cluster is 
restarted (I don't think I've ever actually done that) but also when just one 
node of the 6-shard, 2-replica test cluster is.

The attached log is from one node being restarted while the rest of the 
cluster stayed up.

Could the off-heap RAM be part of the data being sent over the wire?

We've worked around the problem for now by adding more RAM.

> solrcloud crashes on startup due to excessive memory consumption
> 
>
> Key: SOLR-3685
> URL: https://issues.apache.org/jira/browse/SOLR-3685
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java), SolrCloud
>Affects Versions: 4.0-ALPHA
> Environment: Debian GNU/Linux Squeeze 64bit
> Solr 5.0-SNAPSHOT 1365667M - markus - 2012-07-25 19:09:43
>Reporter: Markus Jelsma
>Priority: Critical
> Fix For: 4.1
>
> Attachments: info.log
>
>
> There's a serious problem with restarting nodes, not cleaning old or unused 
> index directories and sudden replication and Java being killed by the OS due 
> to excessive memory allocation. Since SOLR-1781 was fixed index directories 
> get cleaned up when a node is being restarted cleanly, however, old or unused 
> index directories still pile up if Solr crashes or is being killed by the OS, 
> happening here.
> We have a six-node 64-bit Linux test cluster with each node having two 
> shards. There's 512MB RAM available and no swap. Each index is roughly 27MB 
> so about 50MB per node, this fits easily and works fine. However, if a node 
> is being restarted, Solr will consistently crash because it immediately eats 
> up all RAM. If swap is enabled Solr will eat an additional few 100MB's right 
> after start up.
> This cannot be solved by restarting Solr, it will just crash again and leave 
> index directories in place until the disk is full. The only way i can restart 
> a node safely is to delete the index directories and have it replicate from 
> another node. If i then restart the node it will crash almost consistently.
> I'll attach a log of one of the nodes.




[jira] [Updated] (SOLR-3473) Distributed deduplication broken

2012-08-06 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3473:


Attachment: SOLR-3473-trunk-2.patch

Hello - could the deleteByQuery issue you mention be fixed with SOLR-3473? 
I've attached an updated patch for today's trunk. The previous patch was 
missing the signature field, so I added it to one schema. Now other tests seem 
to fail because they don't see the sig field but do use the update chain.

Anyway, BasicDistributedZkTest seems to pass, but I'm not very sure; there's 
too much log output, but it doesn't fail.

> Distributed deduplication broken
> 
>
> Key: SOLR-3473
> URL: https://issues.apache.org/jira/browse/SOLR-3473
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud, update
>Affects Versions: 4.0-ALPHA
>Reporter: Markus Jelsma
> Fix For: 4.0
>
> Attachments: SOLR-3473-trunk-2.patch, SOLR-3473.patch, SOLR-3473.patch
>
>
> Solr's deduplication via the SignatureUpdateProcessor is broken for 
> distributed updates on SolrCloud.
> Mark Miller:
> {quote}
> Looking again at the SignatureUpdateProcessor code, I think that indeed this 
> won't currently work with distrib updates. Could you file a JIRA issue for 
> that? The problem is that we convert update commands into solr documents - 
> and that can cause a loss of info if an update proc modifies the update 
> command.
> I think the reason that you see a multiple values error when you try the 
> other order is because of the lack of a document clone (the other issue I 
> mentioned a few emails back). Addressing that won't solve your issue though - 
> we have to come up with a way to propagate the currently lost info on the 
> update command.
> {quote}
> Please see the ML thread for the full discussion: 
> http://lucene.472066.n3.nabble.com/SolrCloud-deduplication-td3984657.html




[jira] [Created] (SOLR-3705) hl.alternateField does not support glob

2012-08-03 Thread Markus Jelsma (JIRA)
Markus Jelsma created SOLR-3705:
---

 Summary: hl.alternateField does not support glob
 Key: SOLR-3705
 URL: https://issues.apache.org/jira/browse/SOLR-3705
 Project: Solr
  Issue Type: Improvement
  Components: highlighter
Affects Versions: 4.0-ALPHA
Reporter: Markus Jelsma
Priority: Minor
 Fix For: 5.0


Unlike hl.fl, hl.alternateField does not support * to match field globs.




[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption

2012-07-30 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424785#comment-13424785
 ] 

Markus Jelsma commented on SOLR-3685:
-

Hi,

1. Yes, but we allow only one searcher to be warmed at a time. That resource 
usage also belongs to the Java heap; it cannot cause 5x as much heap to be 
allocated.

2. Yes, I'll open a new issue and refer to this one.

3. Well, in some logs I clearly see a core attempting to download, and judging 
from the multiple index directories, that's true. I am very sure no updates 
have been added to the cluster for a long time, yet it still attempts to 
recover. Below is a core recovering.

{code}
2012-07-30 09:48:36,970 INFO [solr.cloud.ZkController] - [main] - : We are http://nl2.index.openindex.io:8080/solr/openindex_a/ and leader is http://nl1.index.openindex.io:8080/solr/openindex_a/
2012-07-30 09:48:36,970 INFO [solr.cloud.ZkController] - [main] - : No LogReplay needed for core=openindex_a baseURL=http://nl2.index.openindex.io:8080/solr
2012-07-30 09:48:36,970 INFO [solr.cloud.ZkController] - [main] - : Core needs to recover:openindex_a
{code}

Something noteworthy may be that for some reasons the index versions of all 
cores and their replica's don't match. After a restart the generation of a core 
is also different while it shouldn't have changed. The size in bytes is also 
slightly different (~20 bytes).

The main thing that's concerning that Solr consumes 5x the allocated heap space 
in the RESident memory. Caches and such are in the heap and the MMapped index 
dir should be in VIRTual memory and not cause the kernel to kill the process. 
I'm not yet sure what's going on here. Also, according to Uwe virtual memory 
should not be more than 2-3 times index size. In our case we see ~800Mb virtual 
memory for two 26Mb cores right after start up.

We have allocated only 98MB to the heap for now, which is enough for such a 
small index.
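As an aside, a quick way to watch the RESident vs. VIRTual size of the Solr JVM is a small script like the one below. This is only a sketch: the {{start.jar}} pattern is an assumption about how the container is launched, so adjust it for your setup.

```shell
#!/bin/sh
# Sketch: report resident (RSS) and virtual (VSZ) memory of a running JVM.
# The "start.jar" pattern is an assumption; match whatever launches your Solr.
pid=$(pgrep -f start.jar | head -n1)
if [ -n "$pid" ]; then
  # On Linux, ps reports RSS and VSZ in kilobytes.
  ps -o pid=,rss=,vsz= -p "$pid"
fi
```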

> solrcloud crashes on startup due to excessive memory consumption
> 
>
> Key: SOLR-3685
> URL: https://issues.apache.org/jira/browse/SOLR-3685
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java), SolrCloud
>Affects Versions: 4.0-ALPHA
> Environment: Debian GNU/Linux Squeeze 64bit
> Solr 5.0-SNAPSHOT 1365667M - markus - 2012-07-25 19:09:43
>Reporter: Markus Jelsma
>Priority: Critical
> Fix For: 4.1
>
> Attachments: info.log
>
>
> There's a serious problem with restarting nodes, not cleaning old or unused 
> index directories and sudden replication and Java being killed by the OS due 
> to excessive memory allocation. Since SOLR-1781 was fixed index directories 
> get cleaned up when a node is being restarted cleanly, however, old or unused 
> index directories still pile up if Solr crashes or is being killed by the OS, 
> happening here.
> We have a six-node 64-bit Linux test cluster with each node having two 
> shards. There's 512MB RAM available and no swap. Each index is roughly 27MB 
> so about 50MB per node, this fits easily and works fine. However, if a node 
> is being restarted, Solr will consistently crash because it immediately eats 
> up all RAM. If swap is enabled Solr will eat an additional few 100MB's right 
> after start up.
> This cannot be solved by restarting Solr, it will just crash again and leave 
> index directories in place until the disk is full. The only way i can restart 
> a node safely is to delete the index directories and have it replicate from 
> another node. If i then restart the node it will crash almost consistently.
> I'll attach a log of one of the nodes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption

2012-07-27 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423841#comment-13423841
 ] 

Markus Jelsma commented on SOLR-3685:
-

Ok, I have increased my DocumentCache again to reproduce the problem and 
reduced -XX:MaxDirectMemorySize from 100m to 10m, but RES is still climbing 
at the same rate as before, so no change. We don't use Tika, only ZooKeeper.
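For reference, the JVM options for this experiment would look roughly as below. The variable name and the exact flag values are assumptions mirroring the settings described in this thread, not a confirmed configuration.

```shell
#!/bin/sh
# Hypothetical JVM options mirroring the experiment described above:
# small heap, small permgen, and the direct-memory cap lowered to 10m.
JAVA_OPTS="-Xmx98m -XX:MaxPermSize=32m -XX:MaxDirectMemorySize=10m"
export JAVA_OPTS
echo "$JAVA_OPTS"
```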

About virtual memory: that also climbs to ~800MB, which is many times the 
index size. There are no pending commits or merges right after start-up.

There may be some cloud-replication-related process that eats the RAM.

Thanks




[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption

2012-07-27 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423801#comment-13423801
 ] 

Markus Jelsma commented on SOLR-3685:
-

Hi - I don't look at virtual memory but at RESident memory. My Solr install 
here will eat up to 512MB of RESIDENT MEMORY and is then killed by the OS. The 
virtual memory is by then almost 800MB, while both indexes are just 27MB in 
size. That is a lot of VIRT and RES for a tiny index and a tiny heap.

Also, Solr runs fine and fast with just 100MB of memory; the index is still 
very small.

Thanks




[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption

2012-07-27 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423794#comment-13423794
 ] 

Markus Jelsma commented on SOLR-3685:
-

Java 1.6.0-26 64-bit, same as Linux.

I should also note that I made an error in the configuration. I thought I had 
reduced the DocumentCache size to 64, but the node I was testing on had a size 
of 1024 configured, and that config was redistributed over the cluster via 
config bootstrap.

This still leaves the problem that Solr itself, not the OS, should run out of 
memory, since the cache is part of the heap. It also should clean up old index 
directories. So this issue may consist of multiple problems.




[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption

2012-07-27 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423786#comment-13423786
 ] 

Markus Jelsma commented on SOLR-3685:
-

I should have added this: I allocate just 98MB to the heap and 32MB to the 
permgen, so there's just 130MB allocated in total.




[jira] [Commented] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption

2012-07-27 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423782#comment-13423782
 ] 

Markus Jelsma commented on SOLR-3685:
-

I forgot to add that it doesn't matter whether updates are sent to the 
cluster. A node will start to replicate on startup even when it's up to date, 
and subsequently crash.




[jira] [Updated] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption

2012-07-27 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3685:


Attachment: info.log

Here's a log for a node where the Java process is being killed by the OS. I can 
reproduce this consistently.




[jira] [Updated] (SOLR-3685) solrcloud crashes on startup due to excessive memory consumption

2012-07-27 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3685:


Summary: solrcloud crashes on startup due to excessive memory consumption  
(was: Solr )




[jira] [Created] (SOLR-3685) Solr

2012-07-27 Thread Markus Jelsma (JIRA)
Markus Jelsma created SOLR-3685:
---

 Summary: Solr 
 Key: SOLR-3685
 URL: https://issues.apache.org/jira/browse/SOLR-3685
 Project: Solr
  Issue Type: Bug
  Components: replication (java), SolrCloud
Affects Versions: 4.0-ALPHA
 Environment: Debian GNU/Linux Squeeze 64bit
Solr 5.0-SNAPSHOT 1365667M - markus - 2012-07-25 19:09:43
Reporter: Markus Jelsma
Priority: Critical
 Fix For: 4.1


There's a serious problem with restarting nodes, not cleaning old or unused 
index directories and sudden replication and Java being killed by the OS due to 
excessive memory allocation. Since SOLR-1781 was fixed index directories get 
cleaned up when a node is being restarted cleanly, however, old or unused index 
directories still pile up if Solr crashes or is being killed by the OS, 
happening here.

We have a six-node 64-bit Linux test cluster with each node having two shards. 
There's 512MB RAM available and no swap. Each index is roughly 27MB so about 
50MB per node, this fits easily and works fine. However, if a node is being 
restarted, Solr will consistently crash because it immediately eats up all RAM. 
If swap is enabled Solr will eat an additional few 100MB's right after start up.

This cannot be solved by restarting Solr, it will just crash again and leave 
index directories in place until the disk is full. The only way i can restart a 
node safely is to delete the index directories and have it replicate from 
another node. If i then restart the node it will crash almost consistently.

I'll attach a log of one of the nodes.




[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

2012-07-27 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423779#comment-13423779
 ] 

Markus Jelsma commented on SOLR-1781:
-

There's still a problem with old index directories not being cleaned up and 
strange replication on start-up. I'll write to the mailing list about this; 
the problem is likely larger than just cleaning up.

> Replication index directories not always cleaned up
> ---
>
> Key: SOLR-1781
> URL: https://issues.apache.org/jira/browse/SOLR-1781
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java), SolrCloud
>Affects Versions: 1.4
> Environment: Windows Server 2003 R2, Java 6b18
>Reporter: Terje Sten Bjerkseth
>Assignee: Mark Miller
> Fix For: 4.0, 5.0
>
> Attachments: 
> 0001-Replication-does-not-always-clean-up-old-directories.patch, 
> SOLR-1781.patch, SOLR-1781.patch
>
>
> We had the same problem as someone described in 
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201001.mbox/%3c222a518d-ddf5-4fc8-a02a-74d4f232b...@snooth.com%3e.
>  A partial copy of that message:
> We're using the new replication and it's working pretty well. There's  
> one detail I'd like to get some more information about.
> As the replication works, it creates versions of the index in the data  
> directory. Originally we had index/, but now there are dated versions  
> such as index.20100127044500/, which are the replicated versions.
> Each copy is sized in the vicinity of 65G. With our current hard drive  
> it's fine to have two around, but 3 gets a little dicey. Sometimes  
> we're finding that the replication doesn't always clean up after  
> itself. I would like to understand this better, or to not have this  
> happen. It could be a configuration issue.




[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

2012-07-26 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423022#comment-13423022
 ] 

Markus Jelsma commented on SOLR-1781:
-

Yes, the problem no longer occurs!
Great work! Thanks




[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

2012-07-25 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422526#comment-13422526
 ] 

Markus Jelsma commented on SOLR-1781:
-

It seems this fixes the issue. I'll double check tomorrow!




[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

2012-07-25 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422308#comment-13422308
 ] 

Markus Jelsma commented on SOLR-1781:
-

Ok, I purged the logs, enabled the info log level and started a Tomcat. New 
indexes are created shortly after:

{code}
2012-07-25 10:13:36,125 WARN [solr.core.SolrCore] - [main] - : New index directory detected: old=null new=/opt/solr/cores/openindex_b/data/index.20120725101231289
{code}

I'll send it right now.




[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

2012-07-25 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422135#comment-13422135
 ] 

Markus Jelsma commented on SOLR-1781:
-

Hi,

I'll restart one node with two cores.

{code}
#cat cores/openindex_b/data/index.properties 
#index properties
#Wed Jul 25 09:58:26 UTC 2012
index=index.20120725095644707
{code}

{code}
# du -h cores/
4.0K    cores/lib
46M     cores/openindex_b/data/index.20120725095644707
404K    cores/openindex_b/data/tlog
46M     cores/openindex_b/data
46M     cores/openindex_b
98M     cores/openindex_a/data/index.20120725095843731
124K    cores/openindex_a/data/tlog
98M     cores/openindex_a/data
98M     cores/openindex_a
144M    cores/
{code}

2012-07-25 10:01:09,176 WARN [solr.core.SolrCore] - [main] - : New index 
directory detected: old=null 
new=/opt/solr/cores/openindex_b/data/index.20120725095644707
...
2012-07-25 10:01:17,303 WARN [solr.core.SolrCore] - [main] - : New index 
directory detected: old=null 
new=/opt/solr/cores/openindex_a/data/index.20120725095843731
...
2012-07-25 10:01:55,016 WARN [solr.core.SolrCore] - [RecoveryThread] - : New 
index directory detected: 
old=/opt/solr/cores/openindex_b/data/index.20120725095644707 
new=/opt/solr/cores/openindex_b/data/index.20120725100120496
...
2012-07-25 10:03:35,236 WARN [solr.core.SolrCore] - [RecoveryThread] - : New 
index directory detected: 
old=/opt/solr/cores/openindex_a/data/index.20120725100220706 
new=/opt/solr/cores/openindex_a/data/index.20120725100321897


{code}
# du -h cores/
4.0K    cores/lib
46M     cores/openindex_b/data/index.20120725095644707
404K    cores/openindex_b/data/tlog
46M     cores/openindex_b/data/index.20120725100120496
91M     cores/openindex_b/data
91M     cores/openindex_b
98M     cores/openindex_a/data/index.20120725100321897
98M     cores/openindex_a/data/index.20120725100220706
124K    cores/openindex_a/data/tlog
196M    cores/openindex_a/data
196M    cores/openindex_a
287M    cores/
{code}

A few minutes later we still have multiple index directories. No updates have 
been sent to the cluster during this whole scenario. Each time another 
directory appears it comes with a lot of I/O; on these RAM-limited machines 
it's almost thrashing because of the additional directory. It does not create 
another directory on every restart, but sometimes it does: I restarted the 
same machine again and now I have three dirs for each core.

I'll turn on INFO logging for the node and restart it again without deleting 
the surplus dirs. The master and slave versions are still the same.

{code}
# du -h cores/
4.0K    cores/lib
46M     cores/openindex_b/data/index.20120725100813961
42M     cores/openindex_b/data/index.20120725101349376
46M     cores/openindex_b/data/index.20120725095644707
46M     cores/openindex_b/data/index.20120725101231289
404K    cores/openindex_b/data/tlog
46M     cores/openindex_b/data/index.20120725100120496
223M    cores/openindex_b/data
223M    cores/openindex_b
98M     cores/openindex_a/data/index.20120725101252920
98M     cores/openindex_a/data/index.20120725100220706
124K    cores/openindex_a/data/tlog
196M    cores/openindex_a/data
196M    cores/openindex_a
418M    cores/
{code}

Maybe it cannot find the current index directory on start up (in my case).


2012-07-25 10:13:36,125 WARN [solr.core.SolrCore] - [main] - : New index 
directory detected: old=null 
new=/opt/solr/cores/openindex_b/data/index.20120725101231289
2012-07-25 10:13:45,840 WARN [solr.core.SolrCore] - [main] - : New index 
directory detected: old=null 
new=/opt/solr/cores/openindex_a/data/index.20120725101252920
2012-07-25 10:15:41,393 WARN [solr.core.SolrCore] - [RecoveryThread] - : New 
index directory detected: 
old=/opt/solr/cores/openindex_b/data/index.20120725101231289 
new=/opt/solr/cores/openindex_b/data/index.20120725101349376
2012-07-25 10:15:46,895 WARN [solr.cloud.RecoveryStrategy] - [main-EventThread] 
- : Stopping recovery for core openindex_b 
zkNodeName=nl2.index.openindex.io:8080_solr_openindex_b
2012-07-25 10:15:46,952 WARN [solr.core.SolrCore] - [RecoveryThread] - : 
[openindex_a] Error opening new searcher. exceeded limit of 
maxWarmingSearchers=1, try again later.
2012-07-25 10:15:47,298 ERROR [solr.cloud.RecoveryStrategy] - [RecoveryThread] 
- : Error while trying to recover.
org.apache.solr.common.SolrException: Error opening new searcher. exceeded 
limit of maxWarmingSearchers=1, try again later.
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1365)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1157)
at 
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:560)
at 
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:316)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:210)
2012-07-25 10:15:47,299 ERROR [solr.cloud.RecoveryStrategy] - [RecoveryThre

[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

2012-07-24 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421675#comment-13421675
 ] 

Markus Jelsma commented on SOLR-1781:
-

Besides search and indexing, only the occasional restart when I change some 
config or deploy a new build. Sometimes I need to start ZK 3.4 again because 
it died for some reason. Restarting Tomcat a few times in a row may be a clue 
here. I'll check again tomorrow whether it's consistent.




[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

2012-07-24 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421565#comment-13421565
 ] 

Markus Jelsma commented on SOLR-1781:
-

Hi,

Both today and yesterday I've deleted indexes more than a few hours old.

Some are indeed being removed and some persist. I just restarted all nodes 
(introduced new fieldTypes and one field) and at least one node has three index 
directories. Others had two, some just one. Not a single node has an `unable to 
delete` string in the logs.

Thanks




[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

2012-07-24 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421323#comment-13421323
 ] 

Markus Jelsma commented on SOLR-1781:
-

One of the nodes ended up with two index directories today. Later some other 
nodes also didn't clean up after they got restarted.




[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

2012-07-23 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420495#comment-13420495
 ] 

Markus Jelsma commented on SOLR-1781:
-

The problem is resolved. The `bad` node created several new index.[0-9] 
directories, even with this patch, and caused high I/O. I deleted the complete 
data directory, and thus also the index.properties file. It loaded its index 
from the other nodes and no longer created many index dirs.

Thanks




[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

2012-07-20 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419385#comment-13419385
 ] 

Markus Jelsma commented on SOLR-1781:
-

Strange indeed. I can/could replicate it on one machine consistently and not 
on others. Machines weren't upgraded at the same time, to prevent cluster 
downtime.

I'll check back Monday; there are two other machines left to upgrade, plus the 
bad node.




[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

2012-07-20 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419127#comment-13419127
 ] 

Markus Jelsma commented on SOLR-1781:
-

Log sent.

This node has two shards on it and executed 2x 512 warmup queries, which adds 
up. It won't talk to ZK (tail of the log). Restarting the node with a July 
18th build works fine. I did it three times today.
Thanks




[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

2012-07-20 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419114#comment-13419114
 ] 

Markus Jelsma commented on SOLR-1781:
-

The node will never respond to HTTP requests: all ZK connections time out, 
very high resource consumption. I'll try to provide a log snippet soon. I 
tried running today's build several times but one specific node refuses to 
`come online`. Another node did well and runs today's build.

I cannot attach a file to a resolved issue. Send it over mail?





[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

2012-07-20 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419032#comment-13419032
 ] 

Markus Jelsma commented on SOLR-1781:
-

Hi, is the core reloading still part of this? I get a lot of firstSearcher 
events on a test node now and it won't come online. Going back to a July 18th 
build (before this patch) works fine. Other nodes won't come online with a 
build from the 19th (after this patch).




[jira] [Created] (SOLR-3644) debug 0.0 process time with distributed search

2012-07-19 Thread Markus Jelsma (JIRA)
Markus Jelsma created SOLR-3644:
---

 Summary: debug 0.0 process time with distributed search
 Key: SOLR-3644
 URL: https://issues.apache.org/jira/browse/SOLR-3644
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other, SolrCloud
Affects Versions: 4.0-ALPHA
 Environment: 5.0.0.2012.07.18.15.28.59
Reporter: Markus Jelsma
Priority: Minor
 Fix For: 4.1


With debugQuery enabled we usually see processTime information for all search 
components. With distributed search, only the processTime for the query, 
highlight, and debug components is non-zero.




[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

2012-07-17 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416318#comment-13416318
 ] 

Markus Jelsma commented on SOLR-1781:
-

Perhaps they could be cleaned up on core start or after some time has passed? 
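To illustrate the suggestion: a startup sweep could compare the directory 
named in index.properties against the index.* directories actually on disk. 
This is only a sketch of the idea, not Solr's actual replication-cleanup code; 
the function name and layout assumptions are mine.

```python
# Sketch: find replicated index.N directories in a core's data dir that are
# no longer the active index named in index.properties. Such directories are
# candidates for cleanup on core start. Illustrative only, not Solr's code.
import os


def stale_index_dirs(data_dir):
    """Return index.* directories other than the one index.properties names."""
    props_path = os.path.join(data_dir, "index.properties")
    current = "index"  # default directory name when no index.properties exists
    if os.path.exists(props_path):
        with open(props_path) as f:
            for line in f:
                line = line.strip()
                if line.startswith("index="):
                    current = line.split("=", 1)[1]
    return sorted(
        name
        for name in os.listdir(data_dir)
        if name.startswith("index")
        and os.path.isdir(os.path.join(data_dir, name))
        and name != current
    )
```

For example, with index.properties pointing at index.20120725100120496, a 
leftover index.20120725095644707 directory would be reported as stale.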




[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

2012-07-17 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416183#comment-13416183
 ] 

Markus Jelsma commented on SOLR-1781:
-

We don't have that; I should have included it in my comment. All servers run 
Debian GNU/Linux 6.0, and the cloud test cluster always runs with a very 
recent build from trunk.




[jira] [Commented] (SOLR-1781) Replication index directories not always cleaned up

2012-07-17 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416005#comment-13416005
 ] 

Markus Jelsma commented on SOLR-1781:
-

This happens almost daily on our SolrCloud (trunk) test cluster; we sometimes 
see four surplus index directories created in a day.




[jira] [Commented] (SOLR-3488) Create a Collections API for SolrCloud

2012-07-12 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412927#comment-13412927
 ] 

Markus Jelsma commented on SOLR-3488:
-

Thanks for clarifying, it makes sense. About the downtime on core reload: a 
load balancer pinging Solr's admin/ping handler will definitely mark the node 
as down; the ping request will time out for up to a few seconds, or even 
longer in case of many firstSearcher events.



> Create a Collections API for SolrCloud
> --
>
> Key: SOLR-3488
> URL: https://issues.apache.org/jira/browse/SOLR-3488
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Mark Miller
>Assignee: Mark Miller
> Fix For: 4.0
>
> Attachments: SOLR-3488.patch, SOLR-3488.patch, SOLR-3488.patch, 
> SOLR-3488_2.patch
>
>





[jira] [Commented] (SOLR-3488) Create a Collections API for SolrCloud

2012-07-12 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412874#comment-13412874
 ] 

Markus Jelsma commented on SOLR-3488:
-

Is it intended for a collection RELOAD action to reload all collection cores 
immediately? That implies downtime, I assume?

> Create a Collections API for SolrCloud
> --
>
> Key: SOLR-3488
> URL: https://issues.apache.org/jira/browse/SOLR-3488
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: Mark Miller
>Assignee: Mark Miller
> Fix For: 4.0
>
> Attachments: SOLR-3488.patch, SOLR-3488.patch, SOLR-3488.patch, 
> SOLR-3488_2.patch
>
>





[jira] [Comment Edited] (SOLR-3564) SpellcheckCollator NPE with timeAllowed set

2012-07-09 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409271#comment-13409271
 ] 

Markus Jelsma edited comment on SOLR-3564 at 7/9/12 8:55 AM:
-

It is in some cases also sent over the wire in response to a legitimate request:

edit: this is actually triggered by something else. 

  was (Author: markus17):
It is in some cases also sent over the wire in response to a legitimate 
request:

{code}
{"response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]},"facet_counts":{"facet_queries":{},"facet_fields":{"type":{},"host":{},"cat":{}},"facet_dates":{},"facet_ranges":{}},"highlighting":{},"error":{"trace":"java.lang.NullPointerException\n\tat
org.apache.solr.handler.component.SpellCheckComponent.finishStage(SpellCheckComponent.java:297)\n\tat
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)\n\tat
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)\n\tat
org.apache.solr.core.SolrCore.execute(SolrCore.java:1599)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:442)
{code}
  
> SpellcheckCollator NPE with timeAllowed set
> ---
>
> Key: SOLR-3564
> URL: https://issues.apache.org/jira/browse/SOLR-3564
> Project: Solr
>  Issue Type: Bug
>  Components: spellchecker
>Affects Versions: 4.0
> Environment: 5.0-SNAPSHOT 1352525M - markus - 2012-06-21 15:23:39
>Reporter: Markus Jelsma
>Priority: Minor
> Fix For: 4.0
>
>
> If the query running time is exceeded during collation checking the 
> SpellcheckCollator throws the following NPE:
> {code}
> 2012-06-21 14:34:12,875 WARN [solr.spelling.SpellCheckCollator] - 
> [http-8080-exec-28] - : Exception trying to re-query to check if a spell 
> check possibility would return any hits.
> java.lang.NullPointerException
> at 
> org.apache.solr.handler.component.ResponseBuilder.setResult(ResponseBuilder.java:399)
> at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:412)
> at 
> org.apache.solr.spelling.SpellCheckCollator.collate(SpellCheckCollator.java:112)
> at 
> org.apache.solr.handler.component.SpellCheckComponent.addCollationsToResponse(SpellCheckComponent.java:203)
> at 
> org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:180)
> at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:204)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1561)
> 
> {code}




[jira] [Commented] (SOLR-3564) SpellcheckCollator NPE with timeAllowed set

2012-07-09 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409271#comment-13409271
 ] 

Markus Jelsma commented on SOLR-3564:
-

It is in some cases also sent over the wire in response to a legitimate request:

{code}
{"response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]},"facet_counts":{"facet_queries":{},"facet_fields":{"type":{},"host":{},"cat":{}},"facet_dates":{},"facet_ranges":{}},"highlighting":{},"error":{"trace":"java.lang.NullPointerException\n\tat
org.apache.solr.handler.component.SpellCheckComponent.finishStage(SpellCheckComponent.java:297)\n\tat
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)\n\tat
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)\n\tat
org.apache.solr.core.SolrCore.execute(SolrCore.java:1599)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:442)
{code}

> SpellcheckCollator NPE with timeAllowed set
> ---
>
> Key: SOLR-3564
> URL: https://issues.apache.org/jira/browse/SOLR-3564
> Project: Solr
>  Issue Type: Bug
>  Components: spellchecker
>Affects Versions: 4.0
> Environment: 5.0-SNAPSHOT 1352525M - markus - 2012-06-21 15:23:39
>Reporter: Markus Jelsma
>Priority: Minor
> Fix For: 4.0
>
>
> If the query running time is exceeded during collation checking the 
> SpellcheckCollator throws the following NPE:
> {code}
> 2012-06-21 14:34:12,875 WARN [solr.spelling.SpellCheckCollator] - 
> [http-8080-exec-28] - : Exception trying to re-query to check if a spell 
> check possibility would return any hits.
> java.lang.NullPointerException
> at 
> org.apache.solr.handler.component.ResponseBuilder.setResult(ResponseBuilder.java:399)
> at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:412)
> at 
> org.apache.solr.spelling.SpellCheckCollator.collate(SpellCheckCollator.java:112)
> at 
> org.apache.solr.handler.component.SpellCheckComponent.addCollationsToResponse(SpellCheckComponent.java:203)
> at 
> org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:180)
> at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:204)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1561)
> 
> {code}




[jira] [Created] (SOLR-3564) SpellcheckCollator NPE with timeAllowed set

2012-06-21 Thread Markus Jelsma (JIRA)
Markus Jelsma created SOLR-3564:
---

 Summary: SpellcheckCollator NPE with timeAllowed set
 Key: SOLR-3564
 URL: https://issues.apache.org/jira/browse/SOLR-3564
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 4.0
 Environment: 5.0-SNAPSHOT 1352525M - markus - 2012-06-21 15:23:39
Reporter: Markus Jelsma
 Fix For: 4.0


If the query running time is exceeded during collation checking the 
SpellcheckCollator throws the following NPE:

{code}
2012-06-21 14:34:12,875 WARN [solr.spelling.SpellCheckCollator] - 
[http-8080-exec-28] - : Exception trying to re-query to check if a spell check 
possibility would return any hits.
java.lang.NullPointerException
at 
org.apache.solr.handler.component.ResponseBuilder.setResult(ResponseBuilder.java:399)
at 
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:412)
at 
org.apache.solr.spelling.SpellCheckCollator.collate(SpellCheckCollator.java:112)
at 
org.apache.solr.handler.component.SpellCheckComponent.addCollationsToResponse(SpellCheckComponent.java:203)
at 
org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:180)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:204)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1561)

{code}
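The failure mode above can be sketched in a few lines. This is a hypothetical Python illustration, not Solr's actual code: when timeAllowed cuts off the collation re-query, the result object is null and the collator dereferences it; a guard that treats the timed-out re-query as zero hits avoids the NPE.

```python
def collate_hits(query_result):
    """Return the hit count for a collation test query.

    A timed-out re-query (represented here as None) is treated as zero
    hits instead of being dereferenced, which is what triggers the NPE
    described above.
    """
    if query_result is None:  # re-query aborted by timeAllowed
        return 0
    return query_result.get("numFound", 0)

print(collate_hits(None))             # timed-out re-query
print(collate_hits({"numFound": 7}))  # normal re-query
```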




[jira] [Commented] (SOLR-3557) Avoid NPE for distributed request when shards.tolerant=true

2012-06-19 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396638#comment-13396638
 ] 

Markus Jelsma commented on SOLR-3557:
-

Patch works fine!

> Avoid NPE for distributed request when shards.tolerant=true
> ---
>
> Key: SOLR-3557
> URL: https://issues.apache.org/jira/browse/SOLR-3557
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Reporter: Ryan McKinley
>Assignee: Ryan McKinley
>Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-3557-tolerant-faceting.patch
>
>
> If a shard fails to return and shards.tolerant=true, the faceting module will 
> get a null pointer.  We should avoid that NPE




[jira] [Updated] (SOLR-3518) No `hits` in SolrResp. NamedList if distrib=true

2012-06-11 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3518:


Attachment: SOLR-3518-4.0-1.patch

Patch for trunk adding the `hits` field to the SolrQueryResponse's NamedList. 
It's only returned in the final response, not in intermediate shard requests in 
a distributed search.

Most likely not a good solution but it seems to work fine for now. Please 
improve.

> No `hits` in SolrResp. NamedList if distrib=true
> 
>
> Key: SOLR-3518
> URL: https://issues.apache.org/jira/browse/SOLR-3518
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.0
> Environment: 5.0-SNAPSHOT 1346798 - markus - 2012-06-06 11:38:15
>Reporter: Markus Jelsma
>Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-3518-4.0-1.patch
>
>
> The hits field in the NamedList obtained from SolrQueryResponse.toLog() is 
> not available for distrib=true requests. The hits field is also not written 
> to the log.
> See also: 
> http://lucene.472066.n3.nabble.com/SolrDispatchFilter-no-hits-in-response-NamedList-if-distrib-true-td3987751.html




[jira] [Created] (SOLR-3525) Per-field similarity should display used impl. in debug output broken

2012-06-08 Thread Markus Jelsma (JIRA)
Markus Jelsma created SOLR-3525:
---

 Summary: Per-field similarity should display used impl. in debug 
output broken
 Key: SOLR-3525
 URL: https://issues.apache.org/jira/browse/SOLR-3525
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 4.0
Reporter: Markus Jelsma
Priority: Minor
 Fix For: 4.0


When using per-field similarity, debugQuery should display the similarity 
implementation used for each match.

Right now it's broken and displays empty brackets:
112.33515 = (MATCH) weight(content:blah in 273) [], result of:




[jira] [Created] (SOLR-3518) No `hits` in SolrResp. NamedList if distrib=true

2012-06-07 Thread Markus Jelsma (JIRA)
Markus Jelsma created SOLR-3518:
---

 Summary: No `hits` in SolrResp. NamedList if distrib=true
 Key: SOLR-3518
 URL: https://issues.apache.org/jira/browse/SOLR-3518
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0
 Environment: 5.0-SNAPSHOT 1346798 - markus - 2012-06-06 11:38:15
Reporter: Markus Jelsma
Priority: Minor
 Fix For: 4.0


The hits field in the NamedList obtained from SolrQueryResponse.toLog() is not 
available for distrib=true requests. The hits field is also not written to the 
log.

See also: 
http://lucene.472066.n3.nabble.com/SolrDispatchFilter-no-hits-in-response-NamedList-if-distrib-true-td3987751.html
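For illustration, the missing hits value is just the total numFound of the merged result; in a distributed search it has to come from the aggregated response rather than from any single shard request. A hypothetical Python sketch of that aggregation (not the attached patch):

```python
def aggregate_hits(shard_responses):
    """Sum numFound across shard responses to obtain the overall `hits`
    value that the response's log NamedList is missing for distrib=true."""
    return sum(r["response"]["numFound"] for r in shard_responses)

shards = [{"response": {"numFound": 10}}, {"response": {"numFound": 7}}]
print(aggregate_hits(shards))
```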




[jira] [Commented] (SOLR-3238) Sequel of Admin UI

2012-05-23 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281607#comment-13281607
 ] 

Markus Jelsma commented on SOLR-3238:
-

I think it would also be useful to display the shard information on the core 
overview page, such as the shard ID and whether the core is a leader.

> Sequel of Admin UI
> --
>
> Key: SOLR-3238
> URL: https://issues.apache.org/jira/browse/SOLR-3238
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Affects Versions: 4.0
>Reporter: Stefan Matheis (steffkes)
>Assignee: Stefan Matheis (steffkes)
> Fix For: 4.0
>
> Attachments: SOLR-3238.patch, SOLR-3238.patch, SOLR-3238.patch, 
> solradminbug.png
>
>
> Catch-All Issue for all upcoming Bugs/Reports/Suggestions on the Admin UI




[jira] [Commented] (SOLR-3473) Distributed deduplication broken

2012-05-21 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280264#comment-13280264
 ] 

Markus Jelsma commented on SOLR-3473:
-

That makes sense indeed.

To work around the problem of having the digest field as ID, could it not 
simply issue a deleteByQuery for the digest prior to adding it? Would that 
cause significant overhead for very large systems with many updates?

We would, from Nutch's point of view, certainly want to avoid changing the ID 
from URL to digest.





> Distributed deduplication broken
> 
>
> Key: SOLR-3473
> URL: https://issues.apache.org/jira/browse/SOLR-3473
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud, update
>Affects Versions: 4.0
>Reporter: Markus Jelsma
> Fix For: 4.0
>
>
> Solr's deduplication via the SignatureUpdateProcessor is broken for 
> distributed updates on SolrCloud.
> Mark Miller:
> {quote}
> Looking again at the SignatureUpdateProcessor code, I think that indeed this 
> won't currently work with distrib updates. Could you file a JIRA issue for 
> that? The problem is that we convert update commands into solr documents - 
> and that can cause a loss of info if an update proc modifies the update 
> command.
> I think the reason that you see a multiple values error when you try the 
> other order is because of the lack of a document clone (the other issue I 
> mentioned a few emails back). Addressing that won't solve your issue though - 
> we have to come up with a way to propagate the currently lost info on the 
> update command.
> {quote}
> Please see the ML thread for the full discussion: 
> http://lucene.472066.n3.nabble.com/SolrCloud-deduplication-td3984657.html




[jira] [Created] (SOLR-3473) Distributed deduplication broken

2012-05-21 Thread Markus Jelsma (JIRA)
Markus Jelsma created SOLR-3473:
---

 Summary: Distributed deduplication broken
 Key: SOLR-3473
 URL: https://issues.apache.org/jira/browse/SOLR-3473
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud, update
Affects Versions: 4.0
Reporter: Markus Jelsma
 Fix For: 4.0


Solr's deduplication via the SignatureUpdateProcessor is broken for distributed 
updates on SolrCloud.

Mark Miller:
{quote}
Looking again at the SignatureUpdateProcessor code, I think that indeed this 
won't currently work with distrib updates. Could you file a JIRA issue for 
that? The problem is that we convert update commands into solr documents - and 
that can cause a loss of info if an update proc modifies the update command.

I think the reason that you see a multiple values error when you try the other 
order is because of the lack of a document clone (the other issue I mentioned a 
few emails back). Addressing that won't solve your issue though - we have to 
come up with a way to propagate the currently lost info on the update command.
{quote}

Please see the ML thread for the full discussion: 
http://lucene.472066.n3.nabble.com/SolrCloud-deduplication-td3984657.html
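For readers unfamiliar with the processor: the signature is a digest computed over configured fields, so near-duplicate documents collapse onto the same value. A minimal, hypothetical sketch of the idea (not Solr's actual Signature classes, and the normalization shown is an assumption):

```python
import hashlib

def signature(doc, fields=("title", "content")):
    """Hash the configured fields after light normalization so that
    near-identical documents produce the same digest."""
    joined = "\x00".join(str(doc.get(f, "")).strip().lower() for f in fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()

a = {"title": "Hello", "content": "World"}
b = {"title": "hello ", "content": " world"}
print(signature(a) == signature(b))  # equal despite case/whitespace differences
```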




[jira] [Updated] (SOLR-3457) Spellchecker always incorrectly spelled

2012-05-18 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3457:


Attachment: SOLR-3457-4.0-1.patch

Patch for trunk. It seems the isCorrectlySpelled flag is not correctly 
initialized. In the example, samsun is incorrectly spelled, has freqInfo and 
zero suggestions, so the flag is never set to true.

> Spellchecker always incorrectly spelled
> ---
>
> Key: SOLR-3457
> URL: https://issues.apache.org/jira/browse/SOLR-3457
> Project: Solr
>  Issue Type: Bug
>  Components: spellchecker
>Affects Versions: 4.0
> Environment: solr-spec 4.0.0.2012.05.15.11.42.06
> solr-impl 4.0-SNAPSHOT 1338601 - markus - 2012-05-15 11:42:06
> lucene-spec 4.0-SNAPSHOT
> lucene-impl 4.0-SNAPSHOT 1338601 - markus - 2012-05-15 10:51:02
>Reporter: Markus Jelsma
> Attachments: SOLR-3457-4.0-1.patch
>
>
> correctlySpelled is always false with default configuration, example config 
> and example documents:
> http://localhost:8983/solr/collection1/browse?wt=xml&spellcheck.extendedResults=true&q=samsung
> {code}
> <lst name="spellcheck">
>   <lst name="suggestions">
>     <bool name="correctlySpelled">false</bool>
>   </lst>
> </lst>
> {code}




[jira] [Commented] (SOLR-3238) Sequel of Admin UI

2012-05-16 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276647#comment-13276647
 ] 

Markus Jelsma commented on SOLR-3238:
-

There's a small issue on the analysis page. After submitting the form with 
whitespace in the fields, the values are printed back still URL-encoded, with + 
for whitespace.

> Sequel of Admin UI
> --
>
> Key: SOLR-3238
> URL: https://issues.apache.org/jira/browse/SOLR-3238
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Affects Versions: 4.0
>Reporter: Stefan Matheis (steffkes)
>Assignee: Stefan Matheis (steffkes)
> Fix For: 4.0
>
> Attachments: SOLR-3238.patch, SOLR-3238.patch, SOLR-3238.patch, 
> solradminbug.png
>
>
> Catch-All Issue for all upcoming Bugs/Reports/Suggestions on the Admin UI




[jira] [Created] (SOLR-3457) Spellchecker always incorrectly spelled

2012-05-16 Thread Markus Jelsma (JIRA)
Markus Jelsma created SOLR-3457:
---

 Summary: Spellchecker always incorrectly spelled
 Key: SOLR-3457
 URL: https://issues.apache.org/jira/browse/SOLR-3457
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 4.0
 Environment: solr-spec 4.0.0.2012.05.15.11.42.06
solr-impl 4.0-SNAPSHOT 1338601 - markus - 2012-05-15 11:42:06
lucene-spec 4.0-SNAPSHOT
lucene-impl 4.0-SNAPSHOT 1338601 - markus - 2012-05-15 10:51:02


Reporter: Markus Jelsma


correctlySpelled is always false with default configuration, example config and 
example documents:
http://localhost:8983/solr/collection1/browse?wt=xml&spellcheck.extendedResults=true&q=samsung

{code}
<lst name="spellcheck">
  <lst name="suggestions">
    <bool name="correctlySpelled">false</bool>
  </lst>
</lst>
{code}
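The flag's intended semantics can be sketched as follows; this is a hypothetical Python illustration of the expected logic, not the actual SpellCheckComponent code. A query should count as correctly spelled when every token exists in the index and yields no suggestions:

```python
def correctly_spelled(tokens):
    """True when every query token is present in the index (freq > 0)
    and the spellchecker produced no suggestions for it."""
    return all(t["freq"] > 0 and not t["suggestions"] for t in tokens)

print(correctly_spelled([{"freq": 42, "suggestions": []}]))
print(correctly_spelled([{"freq": 0, "suggestions": ["samsung"]}]))
```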




[jira] [Commented] (SOLR-3221) Make Shard handler threadpool configurable

2012-05-11 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273514#comment-13273514
 ] 

Markus Jelsma commented on SOLR-3221:
-

I would agree that latency is preferred as the default.

> Make Shard handler threadpool configurable
> --
>
> Key: SOLR-3221
> URL: https://issues.apache.org/jira/browse/SOLR-3221
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 3.6, 4.0
>Reporter: Greg Bowyer
>Assignee: Erick Erickson
>  Labels: distributed, http, shard
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, 
> SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, 
> SOLR-3221-3x_branch.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, 
> SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch
>
>
> From profiling of monitor contention, as well as observations of the
> 95th and 99th response times for nodes that perform distributed search
> (or "aggregator" nodes) it would appear that the HttpShardHandler code
> currently does a suboptimal job of managing outgoing shard level
> requests.
> Presently the code contained within Lucene 3.5's SearchHandler and
> Lucene trunk / 3x's ShardHandlerFactory creates arbitrary threads in
> order to service distributed search requests. This is done to
> limit the size of the threadpool so that it does not consume resources
> in deployment configurations that do not use distributed search.
> This unfortunately has two impacts on the response time if the node
> coordinating the distribution is under high load.
> The usage of the MaxConnectionsPerHost configuration option results in
> aggressive activity on semaphores within HttpCommons, it has been
> observed that the aggregator can have a response time far greater than
> that of the searchers. The above monitor contention would appear to
> suggest that in some cases its possible for liveness issues to occur and
> for simple queries to be starved of resources simply due to a lack of
> attention from the viewpoint of context switching.
> With the HttpCommons connection being hotly contended, as mentioned
> above, the fair, queue-based configuration eliminates this at the cost
> of throughput.
> This patch aims to make the threadpool largely configurable, allowing
> those using Solr to choose the throughput-vs-latency balance they
> desire.




[jira] [Commented] (SOLR-1979) Create LanguageIdentifierUpdateProcessor

2011-09-12 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102578#comment-13102578
 ] 

Markus Jelsma commented on SOLR-1979:
-

Hi. This is not what I understood from reading the wiki doc. Will the update 
processor skip detection with these settings? It's rather costly on many docs.

Anyway, this is great work already, thanks!

> Create LanguageIdentifierUpdateProcessor
> 
>
> Key: SOLR-1979
> URL: https://issues.apache.org/jira/browse/SOLR-1979
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Minor
>  Labels: UpdateProcessor
> Fix For: 3.5
>
> Attachments: SOLR-1979.patch, SOLR-1979.patch, SOLR-1979.patch, 
> SOLR-1979.patch, SOLR-1979.patch, SOLR-1979.patch, SOLR-1979.patch
>
>
> Language identification from document fields, and mapping of field names to 
> language-specific fields based on detected language.
> Wrap the Tika LanguageIdentifier in an UpdateProcessor.




[jira] [Commented] (SOLR-1979) Create LanguageIdentifierUpdateProcessor

2011-09-12 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102520#comment-13102520
 ] 

Markus Jelsma commented on SOLR-1979:
-

Hi Jan,

Can we also use the mapping feature without detection? Our detection is done in 
a Nutch cluster, so we have already identified many millions of docs.

Thanks

> Create LanguageIdentifierUpdateProcessor
> 
>
> Key: SOLR-1979
> URL: https://issues.apache.org/jira/browse/SOLR-1979
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Minor
>  Labels: UpdateProcessor
> Fix For: 3.5
>
> Attachments: SOLR-1979.patch, SOLR-1979.patch, SOLR-1979.patch, 
> SOLR-1979.patch, SOLR-1979.patch, SOLR-1979.patch, SOLR-1979.patch
>
>
> Language identification from document fields, and mapping of field names to 
> language-specific fields based on detected language.
> Wrap the Tika LanguageIdentifier in an UpdateProcessor.




[jira] [Commented] (SOLR-1863) spellchecker leaks file on core reload

2011-08-23 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089390#comment-13089390
 ] 

Markus Jelsma commented on SOLR-1863:
-

Is this still an issue in 3.3 or 4.x?

> spellchecker leaks file on core reload
> --
>
> Key: SOLR-1863
> URL: https://issues.apache.org/jira/browse/SOLR-1863
> Project: Solr
>  Issue Type: Bug
>  Components: spellchecker
> Environment: linux i386 (ubuntu 8.04)
>Reporter: Arne de Bruijn
> Attachments: SOLR-1863.patch
>
>
> When reloading a core of a multicore solr 1.4.0 instance with 
> /admin/cores?action=RELOAD&core=name an extra reference to the spellchecker 
> cfs file appears in the list of open files of the java process. A forced gc 
> (with jconsole) does not help.




[jira] [Commented] (SOLR-2689) !frange with query($qq) sets score=1.0f for all returned documents

2011-08-02 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078467#comment-13078467
 ] 

Markus Jelsma commented on SOLR-2689:
-

You are right; it's because both examples use one search term and thus all 
documents have the same score. The problem shows when the scores are not all 
identical, i.e. when you use multiple terms. I'll provide a better description 
and example next week when I'm back.

> !frange with query($qq) sets score=1.0f for all returned documents
> --
>
> Key: SOLR-2689
> URL: https://issues.apache.org/jira/browse/SOLR-2689
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 3.4
>Reporter: Markus Jelsma
> Fix For: 3.4, 4.0
>
>
> Consider the following queries. Both query the default field for 'test' and 
> return the document digest and score (I don't seem to be able to get only the 
> score; fl=score returns all fields):
> This is a normal query and yields normal results with proper scores:
> {code}
> q=test&fl=digest,score
> {code}
> {code}
> <doc>
>   <float name="score">4.952673</float>
>   <str name="digest">c48e784f06a051d89f20b72194b0dcf0</str>
> </doc>
> <doc>
>   <float name="score">4.952673</float>
>   <str name="digest">7f78a504b8cbd86c6cdbf2aa2c4ae5e3</str>
> </doc>
> <doc>
>   <float name="score">4.952673</float>
>   <str name="digest">0f7fefa6586ceda42fc1f095d460aa17</str>
> </doc>
> {code}
> This query uses frange with query() to limit the number of returned 
> documents. When using multiple search terms I can indeed cut off the result 
> set, but in the end all returned documents have score=1.0f. The final result 
> set cannot be sorted by score anymore. The result set seems to be returned in 
> the order of Lucene docIds.
> {code}
> q={!frange l=1.23}query($qq)&qq=test&fl=digest,score
> {code}
> {code}
> <doc>
>   <float name="score">1.0</float>
>   <str name="digest">c48e784f06a051d89f20b72194b0dcf0</str>
> </doc>
> <doc>
>   <float name="score">1.0</float>
>   <str name="digest">7f78a504b8cbd86c6cdbf2aa2c4ae5e3</str>
> </doc>
> <doc>
>   <float name="score">1.0</float>
>   <str name="digest">0f7fefa6586ceda42fc1f095d460aa17</str>
> </doc>
> {code}




[jira] [Created] (SOLR-2689) !frange with query($qq) sets score=1.0f for all returned documents

2011-08-02 Thread Markus Jelsma (JIRA)
!frange with query($qq) sets score=1.0f for all returned documents
--

 Key: SOLR-2689
 URL: https://issues.apache.org/jira/browse/SOLR-2689
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 3.4
Reporter: Markus Jelsma
 Fix For: 3.4, 4.0


Consider the following queries. Both query the default field for 'test' and 
return the document digest and score (I don't seem to be able to get only the 
score; fl=score returns all fields):

This is a normal query and yields normal results with proper scores:

{code}
q=test&fl=digest,score
{code}

{code}
<doc>
  <float name="score">4.952673</float>
  <str name="digest">c48e784f06a051d89f20b72194b0dcf0</str>
</doc>
<doc>
  <float name="score">4.952673</float>
  <str name="digest">7f78a504b8cbd86c6cdbf2aa2c4ae5e3</str>
</doc>
<doc>
  <float name="score">4.952673</float>
  <str name="digest">0f7fefa6586ceda42fc1f095d460aa17</str>
</doc>
{code}

This query uses frange with query() to limit the number of returned documents. 
When using multiple search terms I can indeed cut off the result set, but in the 
end all returned documents have score=1.0f. The final result set cannot be 
sorted by score anymore. The result set seems to be returned in the order of 
Lucene docIds.

{code}
q={!frange l=1.23}query($qq)&qq=test&fl=digest,score
{code}

{code}
<doc>
  <float name="score">1.0</float>
  <str name="digest">c48e784f06a051d89f20b72194b0dcf0</str>
</doc>
<doc>
  <float name="score">1.0</float>
  <str name="digest">7f78a504b8cbd86c6cdbf2aa2c4ae5e3</str>
</doc>
<doc>
  <float name="score">1.0</float>
  <str name="digest">0f7fefa6586ceda42fc1f095d460aa17</str>
</doc>
{code}
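A possible workaround (an untested sketch, not mentioned in this issue) is to keep the relevance query in q and move the frange cut-off into a filter query, since filter queries do not contribute to scoring:

```text
q=test&fq={!frange l=1.23}query($qq)&qq=test&fl=digest,score
```

Here the documents are still cut off at l=1.23, but the returned score comes from q=test.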




[jira] [Updated] (SOLR-2555) Always incorrectly spelled with onlyMorePopular enabled

2011-07-27 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-2555:


Fix Version/s: 3.4

> Always incorrectly spelled with onlyMorePopular enabled
> ---
>
> Key: SOLR-2555
> URL: https://issues.apache.org/jira/browse/SOLR-2555
> Project: Solr
>  Issue Type: Bug
>  Components: spellchecker
>Affects Versions: 3.1
>Reporter: Markus Jelsma
> Fix For: 3.4
>
>
> The spellcheck component will always mark the term(s) as incorrectly spelled 
> when onlyMorePopular=true, regardless of the term being actually spelled 
> correctly.
> The onlyMorePopular setting can produce collations while the term(s) are 
> correctly spelled. This is fine behaviour. The problem is that it also marks 
> the term(s) as incorrectly spelled when they're actually in the spellcheck 
> index.
> See also this thread:
> http://lucene.472066.n3.nabble.com/correctlySpelled-and-onlyMorePopular-in-3-1-td2975773.html#a2984201




[jira] [Updated] (SOLR-2556) Spellcheck component not returned with numeric queries

2011-07-27 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-2556:


Fix Version/s: 3.4

> Spellcheck component not returned with numeric queries
> --
>
> Key: SOLR-2556
> URL: https://issues.apache.org/jira/browse/SOLR-2556
> Project: Solr
>  Issue Type: Bug
>  Components: spellchecker
>Affects Versions: 3.1
>Reporter: Markus Jelsma
> Fix For: 3.4
>
>
> The spell check component's output is not written when sending queries that 
> consist of numbers only. Clients depending on the availability of the 
> spellcheck output need to check if the output is actually there.
> See also:
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201105.mbox/%3c201105301607.41956.markus.jel...@openindex.io%3E




[jira] [Commented] (SOLR-2662) QueryResultCache is obligatory

2011-07-22 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069524#comment-13069524
 ] 

Markus Jelsma commented on SOLR-2662:
-

The following may be related. Below is the config, just an attempt to disable 
the query cache:

{code}
[queryResultCache element stripped by the mail archive]
{code}

Here's what happens when looking at the stats:

{code}
Concurrent LRU Cache(maxSize=2, initialSize=0, minSize=1, acceptableSize=1, 
cleanupThread=false)
{code}

Strange, it's actually in use and it even works!

{code}
lookups : 41
hits : 0
hitratio : 0.00
inserts : 0
evictions : 0
size : 0
warmupTime : 0
cumulative_lookups : 145
cumulative_hits : 2
cumulative_hitratio : 0.01
cumulative_inserts : 2
cumulative_evictions : 0 
{code}

> QueryResultCache is obligatory
> --
>
> Key: SOLR-2662
> URL: https://issues.apache.org/jira/browse/SOLR-2662
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 3.3
>Reporter: Markus Jelsma
>Priority: Minor
> Fix For: 3.4, 4.0
>
>
> When the queryResultCache is not defined in the configuration, the start 
> parameter is added to the rows parameter: start + rows documents are returned 
> and start is effectively always 0.




[jira] [Created] (SOLR-2662) QueryResultCache is obligatory

2011-07-18 Thread Markus Jelsma (JIRA)
QueryResultCache is obligatory
--

 Key: SOLR-2662
 URL: https://issues.apache.org/jira/browse/SOLR-2662
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 3.3
Reporter: Markus Jelsma
Priority: Minor
 Fix For: 3.4


When the queryResultCache is not defined in the configuration, the start 
parameter is added to the rows parameter: start + rows documents are returned 
and start is effectively always 0.
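For clarity, the intended windowing can be sketched as a plain slice. This is a hypothetical illustration of the behaviour described above, not Solr code; `page` is the correct slice and `buggyPage` reproduces the reported bug:

```java
import java.util.List;

// Hypothetical illustration of the pagination bug described above; not Solr code.
public class PageSlice {
    // Correct behaviour: return documents [start, start + rows).
    static List<Integer> page(List<Integer> docIds, int start, int rows) {
        int from = Math.min(start, docIds.size());
        int to = Math.min(start + rows, docIds.size());
        return docIds.subList(from, to);
    }

    // Reported behaviour: start is folded into rows, so documents
    // [0, start + rows) come back and start is effectively 0.
    static List<Integer> buggyPage(List<Integer> docIds, int start, int rows) {
        return docIds.subList(0, Math.min(start + rows, docIds.size()));
    }
}
```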




[jira] [Commented] (SOLR-2545) Allow equals-sign in key of external file field

2011-07-15 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066149#comment-13066149
 ] 

Markus Jelsma commented on SOLR-2545:
-

Great work! Thanks!

> Allow equals-sign in key of external file field
> ---
>
> Key: SOLR-2545
> URL: https://issues.apache.org/jira/browse/SOLR-2545
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Reporter: Markus Jelsma
>Assignee: Hoss Man
>Priority: Minor
> Fix For: 3.4, 4.0
>
> Attachments: SOLR-2545.patch
>
>
> The external file field doesn't allow an equals-sign in the key. Instead of 
> going through the hassle of escaping, this patch just uses lastIndexOf to get 
> the float value.




[jira] [Updated] (SOLR-2545) Allow equals-sign in key of external file field

2011-07-14 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-2545:


Fix Version/s: 3.4
   3.2.1
   3.1.1

> Allow equals-sign in key of external file field
> ---
>
> Key: SOLR-2545
> URL: https://issues.apache.org/jira/browse/SOLR-2545
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Affects Versions: 3.1
>Reporter: Markus Jelsma
>Priority: Minor
> Fix For: 3.1.1, 3.2.1, 3.4
>
> Attachments: SOLR-2545.patch
>
>
> The external file field doesn't allow an equals-sign in the key. Instead of 
> going through the hassle of escaping, this patch just uses lastIndexOf to get 
> the float value.




[jira] [Commented] (SOLR-2555) Always incorrectly spelled with onlyMorePopular enabled

2011-05-30 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041298#comment-13041298
 ] 

Markus Jelsma commented on SOLR-2555:
-

It's not the suggester we use but the plain old spellcheck component. As far as 
I remember (checked last week), the suggester doesn't return a correctlySpelled 
parameter. The spellchecker also doesn't take a lookupImpl parameter (according 
to the wiki). It's an IndexBasedSpellChecker.

> Always incorrectly spelled with onlyMorePopular enabled
> ---
>
> Key: SOLR-2555
> URL: https://issues.apache.org/jira/browse/SOLR-2555
> Project: Solr
>  Issue Type: Bug
>  Components: spellchecker
>Affects Versions: 3.1
>Reporter: Markus Jelsma
>
> The spellcheck component will always mark the term(s) as incorrectly spelled 
> when onlyMorePopular=true, regardless of the term being actually spelled 
> correctly.
> The onlyMorePopular setting can produce collations while the term(s) are 
> correctly spelled. This is fine behaviour. The problem is that it also marks 
> the term(s) as incorrectly spelled when they're actually in the spellcheck 
> index.
> See also this thread:
> http://lucene.472066.n3.nabble.com/correctlySpelled-and-onlyMorePopular-in-3-1-td2975773.html#a2984201




[jira] [Created] (SOLR-2556) Spellcheck component not returned with numeric queries

2011-05-30 Thread Markus Jelsma (JIRA)
Spellcheck component not returned with numeric queries
--

 Key: SOLR-2556
 URL: https://issues.apache.org/jira/browse/SOLR-2556
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 3.1
Reporter: Markus Jelsma


The spell check component's output is not written when sending queries that 
consist of numbers only. Clients depending on the availability of the 
spellcheck output need to check if the output is actually there.


See also:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201105.mbox/%3c201105301607.41956.markus.jel...@openindex.io%3E
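Until the spellcheck section is written unconditionally, the client-side guard mentioned above can look like this. A minimal sketch assuming the response has already been parsed into a Map; `hasSpellcheck` is a hypothetical helper, not SolrJ API:

```java
import java.util.Map;

// Hypothetical client-side guard: treat the parsed response as a Map and
// tolerate an absent "spellcheck" section (as happens for numeric queries).
public class SpellcheckGuard {
    static boolean hasSpellcheck(Map<String, Object> response) {
        return response.get("spellcheck") != null;
    }
}
```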




[jira] [Created] (SOLR-2555) Always incorrectly spelled with onlyMorePopular enabled

2011-05-30 Thread Markus Jelsma (JIRA)
Always incorrectly spelled with onlyMorePopular enabled
---

 Key: SOLR-2555
 URL: https://issues.apache.org/jira/browse/SOLR-2555
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 3.1
Reporter: Markus Jelsma


The spellcheck component will always mark the term(s) as incorrectly spelled 
when onlyMorePopular=true, regardless of the term being actually spelled 
correctly.

The onlyMorePopular setting can produce collations while the term(s) are 
correctly spelled. This is fine behaviour. The problem is that it also marks 
the term(s) as incorrectly spelled when they're actually in the spellcheck 
index.

See also this thread:
http://lucene.472066.n3.nabble.com/correctlySpelled-and-onlyMorePopular-in-3-1-td2975773.html#a2984201




[jira] [Commented] (SOLR-2105) Rename RequestHandler param 'update.processor' to 'update.chain'.

2011-05-25 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039360#comment-13039360
 ] 

Markus Jelsma commented on SOLR-2105:
-

Excellent job printing the deprecation warning; I seem to have overlooked 
this issue!

> Rename RequestHandler param 'update.processor' to 'update.chain'.
> -
>
> Key: SOLR-2105
> URL: https://issues.apache.org/jira/browse/SOLR-2105
> Project: Solr
>  Issue Type: Improvement
>  Components: update
>Affects Versions: 1.4.1
>Reporter: Jan Høydahl
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: SOLR-2105.patch, SOLR-2105.patch, SOLR-2105.patch
>
>
> Today we reference a custom updateRequestProcessorChain using the update 
> request parameter "update.processor".
> See 
> http://wiki.apache.org/solr/SolrConfigXml#UpdateRequestProcessorChain_section
> This is confusing, since what we are really referencing is not an 
> UpdateProcessor, but an updateRequestProcessorChain.
> I propose that "update.processor" is renamed as "update.chain" or similar




[jira] [Updated] (SOLR-2545) Allow equals-sign in key of external file field

2011-05-25 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-2545:


Attachment: SOLR-2545.patch

> Allow equals-sign in key of external file field
> ---
>
> Key: SOLR-2545
> URL: https://issues.apache.org/jira/browse/SOLR-2545
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Affects Versions: 3.1
>Reporter: Markus Jelsma
>Priority: Minor
> Attachments: SOLR-2545.patch
>
>
> The external file field doesn't allow an equals-sign in the key. Instead of 
> going through the hassle of escaping, this patch just uses lastIndexOf to get 
> the float value.




[jira] [Created] (SOLR-2545) Allow equals-sign in key of external file field

2011-05-25 Thread Markus Jelsma (JIRA)
Allow equals-sign in key of external file field
---

 Key: SOLR-2545
 URL: https://issues.apache.org/jira/browse/SOLR-2545
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 3.1
Reporter: Markus Jelsma
Priority: Minor
 Attachments: SOLR-2545.patch

The external file field doesn't allow an equals-sign in the key. Instead of 
going through the hassle of escaping, this patch just uses lastIndexOf to get 
the float value.
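The lastIndexOf approach can be illustrated with a small sketch. This is a hypothetical helper, not the actual patch code: everything after the last '=' is the float, everything before it is the key, so keys may contain '=' without any escaping:

```java
// Hypothetical sketch of lastIndexOf-based parsing for external file field
// lines of the form "key=value": the key may itself contain '=' because only
// the LAST '=' separates key from float. Not the actual patch code.
public class ExternalFileLine {
    static String key(String line) {
        return line.substring(0, line.lastIndexOf('='));
    }

    static float value(String line) {
        return Float.parseFloat(line.substring(line.lastIndexOf('=') + 1));
    }
}
```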




[jira] [Created] (SOLR-2442) Different cores use the same admin-extra.html

2011-03-25 Thread Markus Jelsma (JIRA)
Different cores use the same admin-extra.html
-

 Key: SOLR-2442
 URL: https://issues.apache.org/jira/browse/SOLR-2442
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 1.4.1
Reporter: Markus Jelsma
Priority: Minor


Solr loads a single admin-extra.html for all cores (each core's file overwrites 
the previous one), so the admin-extra.html of the core specified last in 
solr.xml is the one that ends up being used.




[jira] Updated: (SOLR-2327) java.lang.ArithmeticException: / by zero with queryResultCache size=0 and queryResultWindowSize=0

2011-01-21 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-2327:


Fix Version/s: 3.1
 Priority: Minor  (was: Major)
Affects Version/s: 3.1

> java.lang.ArithmeticException: / by zero with queryResultCache size=0 and 
> queryResultWindowSize=0
> -
>
> Key: SOLR-2327
> URL: https://issues.apache.org/jira/browse/SOLR-2327
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 1.4.1, 3.1
>Reporter: Markus Jelsma
>Priority: Minor
> Fix For: 3.1
>
>
> With the following configuration:
> <queryResultCache size="0" autowarmCount="0"/>
> <queryResultWindowSize>0</queryResultWindowSize>
> The following exception will occur:
> 2011-01-21 13:48:13,599 ERROR [solr.core.SolrCore] - [http-8080-1] - : 
> java.lang.ArithmeticException: / by zero
> at 
> org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:833)
> at 
> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
> at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
> at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1317)
> at 
> org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:63)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1317)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
> at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at 
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> at 
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> at 
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
> at 
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> at 
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> at 
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
> at 
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
> at 
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
> at 
> org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
> at java.lang.Thread.run(Thread.java:619)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Created: (SOLR-2327) java.lang.ArithmeticException: / by zero with queryResultCache size=0 and queryResultWindowSize=0

2011-01-21 Thread Markus Jelsma (JIRA)
java.lang.ArithmeticException: / by zero with queryResultCache size=0 and 
queryResultWindowSize=0
-

 Key: SOLR-2327
 URL: https://issues.apache.org/jira/browse/SOLR-2327
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 1.4.1
Reporter: Markus Jelsma


With the following configuration:

<queryResultCache size="0" autowarmCount="0"/>
<queryResultWindowSize>0</queryResultWindowSize>

The following exception will occur:

2011-01-21 13:48:13,599 ERROR [solr.core.SolrCore] - [http-8080-1] - : 
java.lang.ArithmeticException: / by zero
at 
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:833)
at 
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
at 
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1317)
at 
org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:63)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1317)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:619)
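The failing division is consistent with rounding the requested doc range up to a multiple of queryResultWindowSize, which divides by zero when the window size is configured as 0. A hedged sketch of that pattern and a defensive clamp; an illustration only, the actual expression in SolrIndexSearcher may differ:

```java
// Illustration of the suspected failure mode: rounding a doc range up to a
// multiple of queryResultWindowSize divides by that window size, which
// throws ArithmeticException when it is configured as 0. Sketch only; the
// actual code in SolrIndexSearcher may differ.
public class WindowRounding {
    static int roundUp(int end, int windowSize) {
        // naive form: blows up when windowSize == 0
        return ((end + windowSize - 1) / windowSize) * windowSize;
    }

    static int roundUpSafe(int end, int windowSize) {
        int w = Math.max(1, windowSize);  // defensive clamp for size=0 configs
        return ((end + w - 1) / w) * w;
    }
}
```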





[jira] Updated: (SOLR-2323) Solr should clean old replication temp dirs

2011-01-19 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-2323:


Description: 
In a high commit rate environment (polling < 10s and commits every minute) the 
shutdown/restart of a slave can result in old temp directories lying around, 
filling up disk space over time. This happens in the following scenario:

1. master has index version 2
2. slave downloads files for version 2 to index.2 temp directory
3. slave is shutdown
4. master increments to version 3
5. slave is started
6. slave downloads files for version 3 to index.3 temp directory

The result is index.2 temp directory not getting deleted by any process. This 
is very annoying in such an environment where nodes are restarted frequently 
(for whatever reason). Working around the problem can be done by either 
manually deleting the temp directories between shutdown and startup or by 
calling the disablepoll command followed by an abortfetch command which will 
(after a long wait) finally purge the temp directory.

See this thread:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg45120.html

  was:
In a high commit rate environment (polling < 10s and commits every minute) the 
shutdown/restart of a slave can result in old temp directories lying around, 
filling up disk space over time. This happens in the following scenario:

1. master has index version 2
2. slave downloads files for version 2 to index.2 temp directory
3. slave is shutdown
4. master increments to version 3
5. slave is started
6. slave downloads files for version 3 to index.2 temp directory

The result is index.2 temp directory not getting deleted by any process. This 
is very annoying in such an environment where nodes are restarted frequently 
(for whatever reason). Working around the problem can be done by either 
manually deleting the temp directories between shutdown and startup or by 
calling the disablepoll command followed by an abortfetch command which will 
(after a long wait) finally purge the temp directory.

See this thread:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg45120.html


> Solr should clean old replication temp dirs
> ---
>
> Key: SOLR-2323
> URL: https://issues.apache.org/jira/browse/SOLR-2323
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java)
>Affects Versions: 1.4.1
>Reporter: Markus Jelsma
>
> In a high commit rate environment (polling < 10s and commits every minute) 
> the shutdown/restart of a slave can result in old temp directories lying 
> around, filling up disk space over time. This happens in the following 
> scenario:
> 1. master has index version 2
> 2. slave downloads files for version 2 to index.2 temp directory
> 3. slave is shutdown
> 4. master increments to version 3
> 5. slave is started
> 6. slave downloads files for version 3 to index.3 temp directory
> The result is index.2 temp directory not getting deleted by any process. This 
> is very annoying in such an environment where nodes are restarted frequently 
> (for whatever reason). Working around the problem can be done by either 
> manually deleting the temp directories between shutdown and startup or by 
> calling the disablepoll command followed by an abortfetch command which will 
> (after a long wait) finally purge the temp directory.
> See this thread:
> http://www.mail-archive.com/solr-user@lucene.apache.org/msg45120.html




[jira] Updated: (SOLR-2323) Solr should clean old replication temp dirs

2011-01-19 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-2323:


Description: 
In a high commit rate environment (polling < 10s and commits every minute) the 
shutdown/restart of a slave can result in old temp directories lying around, 
filling up disk space over time. This happens in the following scenario:

1. master has index version 2
2. slave downloads files for version 2 to index.2 temp directory
3. slave is shutdown
4. master increments to version 3
5. slave is started
6. slave downloads files for version 3 to index.2 temp directory

The result is index.2 temp directory not getting deleted by any process. This 
is very annoying in such an environment where nodes are restarted frequently 
(for whatever reason). Working around the problem can be done by either 
manually deleting the temp directories between shutdown and startup or by 
calling the disablepoll command followed by an abortfetch command which will 
(after a long wait) finally purge the temp directory.

See this thread:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg45120.html

  was:
In a high commit rate environment (polling < 10s and commits every minute) the 
shutdown/restart of a slave can result in old temp directories lying around, 
filling up disk space over time. This happens in the following scenario:

1. master has index version 2
2. slave downloads files for version 2 to index.2 temp directory
3. slave is shutdown
4. master increments to version 3
5. slave downloads files for version 3 to index.2 temp directory

The result is index.2 temp directory not getting deleted by any process. This 
is very annoying in such an environment where nodes are restarted frequently 
(for whatever reason). Working around the problem can be done by either 
manually deleting the temp directories between shutdown and startup or by 
calling the disablepoll command followed by an abortfetch command which will 
(after a long wait) finally purge the temp directory.

See this thread:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg45120.html


> Solr should clean old replication temp dirs
> ---
>
> Key: SOLR-2323
> URL: https://issues.apache.org/jira/browse/SOLR-2323
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java)
>Affects Versions: 1.4.1
>Reporter: Markus Jelsma
>
> In a high commit rate environment (polling < 10s and commits every minute) 
> the shutdown/restart of a slave can result in old temp directories lying 
> around, filling up disk space over time. This happens in the following 
> scenario:
> 1. master has index version 2
> 2. slave downloads files for version 2 to index.2 temp directory
> 3. slave is shutdown
> 4. master increments to version 3
> 5. slave is started
> 6. slave downloads files for version 3 to index.2 temp directory
> The result is index.2 temp directory not getting deleted by any process. This 
> is very annoying in such an environment where nodes are restarted frequently 
> (for whatever reason). Working around the problem can be done by either 
> manually deleting the temp directories between shutdown and startup or by 
> calling the disablepoll command followed by an abortfetch command which will 
> (after a long wait) finally purge the temp directory.
> See this thread:
> http://www.mail-archive.com/solr-user@lucene.apache.org/msg45120.html




[jira] Created: (SOLR-2323) Solr should clean old replication temp dirs

2011-01-19 Thread Markus Jelsma (JIRA)
Solr should clean old replication temp dirs
---

 Key: SOLR-2323
 URL: https://issues.apache.org/jira/browse/SOLR-2323
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 1.4.1
Reporter: Markus Jelsma


In a high commit rate environment (polling < 10s and commits every minute) the 
shutdown/restart of a slave can result in old temp directories lying around, 
filling up disk space over time. This happens with the following scenario:

1. master has index version 2
2. slave downloads files for version 2 to index.2 temp directory
3. slave is shutdown
4. master increments to version 3
5. slave is started
6. slave downloads files for version 3 to index.2 temp directory

The result is index.2 temp directory not getting deleted by any process. This 
is very annoying in such an environment where nodes are restarted frequently 
(for whatever reason). Working around the problem can be done by either 
manually deleting the temp directories between shutdown and startup or by 
calling the disablepoll command followed by an abortfetch command which will 
(after a long wait) finally purge the temp directory.

See this thread:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg45120.html
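The manual workaround described above can be scripted against the slave's ReplicationHandler. This is an illustrative sketch, not from the issue: the slave URL is an assumption, and the script only prints the curl commands so they can be reviewed before running (disablepoll, abortfetch, then enablepoll to restore polling).

```shell
# Sketch of the workaround described above (slave URL is an assumption).
# disablepoll/abortfetch/enablepoll are ReplicationHandler commands;
# this only prints the commands rather than executing them.
SLAVE="http://localhost:8983/solr"

for cmd in disablepoll abortfetch enablepoll; do
  echo "curl '$SLAVE/replication?command=$cmd'"
done
```

Note the issue's observation that abortfetch can take a long time before the temp directory is actually purged.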



[jira] Commented: (SOLR-2277) Update with add and delete combined fails

2010-12-09 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969786#action_12969786
 ] 

Markus Jelsma commented on SOLR-2277:
-

Ah, an undocumented feature. I've added this to the wiki. I assume this ticket 
can be closed?
http://wiki.apache.org/solr/UpdateXmlMessages#Add_and_delete_in_a_single_batch
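For reference, a minimal sketch of the batch format that wiki page documents, as I recall it: a single root element wrapping both operations, instead of sibling roots. The payload below is illustrative (only the id value 171234 is taken from the issue); the grep merely confirms there is exactly one root.

```shell
# Illustrative payload per the wiki page above: one <update> root wrapping
# both operations, instead of sibling roots (which cause the
# "Illegal to have multiple roots" error shown in the stack trace).
PAYLOAD='<update>
  <add><doc><field name="id">171234</field></doc></add>
  <delete><id>171234</id></delete>
</update>'

printf '%s\n' "$PAYLOAD" | grep -c '^<update>'   # prints 1
# To send it (assuming a local 1.4.1 instance):
# curl 'http://127.0.0.1:8983/solr/update?commit=true' \
#   -H 'Content-Type: text/xml' --data-binary "$PAYLOAD"
```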

> Update with add and delete combined fails
> -
>
> Key: SOLR-2277
> URL: https://issues.apache.org/jira/browse/SOLR-2277
> Project: Solr
>  Issue Type: Bug
>  Components: update
>Affects Versions: 1.4.1
>Reporter: Markus Jelsma
>
> The following curl command:
> curl http://127.0.0.1:8983/solr/update/?commit=true -H "Content-Type: 
> text/xml" --data-binary ' name="id">171234'; 
> will trigger the following exception in Solr 1.4.1:
> Dec 9, 2010 12:51:22 PM org.apache.solr.common.SolrException log
> SEVERE: org.apache.solr.common.SolrException: Illegal to have multiple roots 
> (start tag in epilog?).
>  at [row,col {unknown-source}]: [47,2]
> at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:72)
> at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
> at 
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
> at 
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> at 
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
> at 
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
> at 
> org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
> at 
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
> at 
> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
> at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
> at org.mortbay.jetty.Server.handle(Server.java:285)
> at 
> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
> at 
> org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
> at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
> at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
> at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
> at 
> org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
> at 
> org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
> Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple 
> roots (start tag in epilog?).
>  at [row,col {unknown-source}]: [47,2]
> at 
> com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630)
> at 
> com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)
> at 
> com.ctc.wstx.sr.BasicStreamReader.handleExtraRoot(BasicStreamReader.java:2155)
> at 
> com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2070)
> at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1071)
> at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:90)
> at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
> ... 22 more



[jira] Created: (SOLR-2278) PHPSerialized fails with Solr spatial

2010-12-09 Thread Markus Jelsma (JIRA)
PHPSerialized fails with Solr spatial
-

 Key: SOLR-2278
 URL: https://issues.apache.org/jira/browse/SOLR-2278
 Project: Solr
  Issue Type: Bug
  Components: Response Writers
Affects Versions: 1.4.1
Reporter: Markus Jelsma


Solr throws a java.lang.IllegalArgumentException: Map size must not be negative 
when using the PHP Serialized response writer with the JTeam SolrSpatial 
plugin in front. At first it may seem to be a bug in the plugin, but according 
to some posts in the mailing list thread ( 
http://lucene.472066.n3.nabble.com/Map-size-must-not-be-negative-with-spatial-results-php-serialized-td2039782.html
 ) it just might be a bug in Solr.

The only way I know of to reproduce the issue is using LocalParams to set 
spatial parameters and having the spatial search component activated as 
last-components. If the query yields no results, the exception won't show up.

  
  

  distance

  

  
  
1
1
60
ad_latitude
ad_longitude
_tier_
  

In the request handler:

  geodistance


query:
http://localhost:8983/solr/search?q={!spatial%20lat=51.9562%20long=6.02606%20radius=432%20unit=km}auto&wt=php
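The reproduction query above can be awkward to issue from a shell because of the LocalParams braces. A small sketch (host and core names are taken from the issue; the script only prints the command):

```shell
# Sketch of issuing the reproduction query above. The LocalParams braces
# must be quoted so the shell does not interpret them; spaces are already
# %20-encoded in the query string.
Q='{!spatial%20lat=51.9562%20long=6.02606%20radius=432%20unit=km}auto'
echo "curl 'http://localhost:8983/solr/search?q=$Q&wt=php'"
```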




[jira] Created: (SOLR-2277) Update with add and delete combined fails

2010-12-09 Thread Markus Jelsma (JIRA)
Update with add and delete combined fails
-

 Key: SOLR-2277
 URL: https://issues.apache.org/jira/browse/SOLR-2277
 Project: Solr
  Issue Type: Bug
  Components: update
Affects Versions: 1.4.1
Reporter: Markus Jelsma


The following curl command:
curl http://127.0.0.1:8983/solr/update/?commit=true -H "Content-Type: text/xml" 
--data-binary '171234'; 

will trigger the following exception in Solr 1.4.1:
Dec 9, 2010 12:51:22 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Illegal to have multiple roots 
(start tag in epilog?).
 at [row,col {unknown-source}]: [47,2]
at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:72)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at 
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at 
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at 
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at 
org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple 
roots (start tag in epilog?).
 at [row,col {unknown-source}]: [47,2]
at 
com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630)
at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)
at 
com.ctc.wstx.sr.BasicStreamReader.handleExtraRoot(BasicStreamReader.java:2155)
at 
com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2070)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1071)
at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:90)
at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
... 22 more




[jira] Commented: (SOLR-1752) SolrJ fails with exception when passing document ADD and DELETEs in the same request using XML request writer (but not binary request writer)

2010-12-09 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969704#action_12969704
 ] 

Markus Jelsma commented on SOLR-1752:
-

This isn't limited to SolrJ. The following curl command will trigger the same 
error in 1.4.1:
curl http://127.0.0.1:8983/solr/update/?commit=true -H "Content-Type: text/xml" 
--data-binary '171234';



> SolrJ fails with exception when passing document ADD and DELETEs in the same 
> request using XML request writer (but not binary request writer)
> -
>
> Key: SOLR-1752
> URL: https://issues.apache.org/jira/browse/SOLR-1752
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java, update
>Affects Versions: 1.4
>Reporter: Jayson Minard
>Assignee: Shalin Shekhar Mangar
>Priority: Blocker
>
> Add this test to SolrExampleTests.java and it will fail when using the XML 
> Request Writer (now default), but not if you change the SolrExampleJettyTest 
> to use the BinaryRequestWriter.
> {code}
>  public void testAddDeleteInSameRequest() throws Exception {
> SolrServer server = getSolrServer();
> SolrInputDocument doc3 = new SolrInputDocument();
> doc3.addField( "id", "id3", 1.0f );
> doc3.addField( "name", "doc3", 1.0f );
> doc3.addField( "price", 10 );
> UpdateRequest up = new UpdateRequest();
> up.add( doc3 );
> up.deleteById("id001");
> up.setWaitFlush(false);
> up.setWaitSearcher(false);
> up.process( server );
>   }
> {code}
> terminates with exception:
> {code}
> Feb 3, 2010 8:55:34 AM org.apache.solr.common.SolrException log
> SEVERE: org.apache.solr.common.SolrException: Illegal to have multiple roots 
> (start tag in epilog?).
>  at [row,col {unknown-source}]: [1,125]
>   at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:72)
>   at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
>   at 
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
>   at 
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>   at 
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
>   at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
>   at org.mortbay.jetty.Server.handle(Server.java:285)
>   at 
> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
>   at 
> org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
>   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:723)
>   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
>   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
>   at 
> org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
>   at 
> org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
> Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple 
> roots (start tag in epilog?).
>  at [row,col {unknown-source}]: [1,125]
>   at 
> com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630)
>   at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)
>   at 
> com.ctc.wstx.sr.BasicStreamReader.handleExtraRoot(BasicStreamReader.java:2155)
>   at 
> com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2070)
>   at 
> com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2647)
>   at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
>   at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:90)
>   at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
>   ... 18 more
> {code}


