[jira] [Commented] (SOLR-4234) ZooKeeper doesn't handle binary files

2012-12-27 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539870#comment-13539870
 ] 

Erik Hatcher commented on SOLR-4234:


lol - while I like /admin/file and of course mixing and matching it with the 
VelocityResponseWriter, I cringe at ZK serving up favicons.

But hey, better that it works for binary files than didn't, so +1 :)

One minor issue with the patch is the incorrect javadoc added to the 
ByteArrayStream ctor ("Construct a <code>ContentStream</code> from a 
<code>File</code>"), so let's get that correct.

 ZooKeeper doesn't handle binary files
 -

 Key: SOLR-4234
 URL: https://issues.apache.org/jira/browse/SOLR-4234
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0
Reporter: Eric Pugh
Assignee: Mark Miller
 Fix For: 4.1, 5.0

 Attachments: binary_upload_download.patch, 
 fix_show_file_handler_with_binaries.patch


 I was attempting to get the ShowFileHandler to show a .png file, and it was 
 failing.  But in non-ZK mode it worked just fine!   It took a while, but it 
 seems that we upload to ZK as text, and download as text as well.  I've 
 attached a unit test that demonstrates the problem, and a fix.  You have to 
 have a binary file in the conf directory to make the test work; I put 
 solr.png in the collection1/conf/velocity directory.
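
To illustrate why a text upload/download path damages a file such as solr.png, here is a minimal sketch. It is not the Solr or ZooKeeper code; `roundtrip_as_text` and `roundtrip_as_bytes` are hypothetical helpers, and the choice of UTF-8 with replacement characters is an assumption made only to show the effect.

```python
# The first eight bytes of every PNG file (the PNG signature).
PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])


def roundtrip_as_text(data: bytes) -> bytes:
    """Hypothetical lossy path: treat the payload as UTF-8 text and re-encode it."""
    return data.decode("utf-8", errors="replace").encode("utf-8")


def roundtrip_as_bytes(data: bytes) -> bytes:
    """Byte-preserving path: the payload is never interpreted as characters."""
    return bytes(data)


# 0x89 is not valid UTF-8, so the text path replaces it and the file changes.
print(roundtrip_as_text(PNG_SIGNATURE) == PNG_SIGNATURE)   # False
print(roundtrip_as_bytes(PNG_SIGNATURE) == PNG_SIGNATURE)  # True
```

Plain-text config files survive either path, which may be why the problem only shows up with binary files.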

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.6.0_37) - Build # 3446 - Failure!

2012-12-27 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux/3446/
Java: 32bit/jdk1.6.0_37 -client -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 8786 lines...]
[junit4:junit4] ERROR: JVM J1 ended with an exception, command line: 
/mnt/ssd/jenkins/tools/java/32bit/jdk1.6.0_37/jre/bin/java -client 
-XX:+UseConcMarkSweepGC -XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/heapdumps 
-Dtests.prefix=tests -Dtests.seed=31765500A169F030 -Xmx512M -Dtests.iters= 
-Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random 
-Dtests.postingsformat=random -Dtests.locale=random -Dtests.timezone=random 
-Dtests.directory=random -Dtests.linedocsfile=europarl.lines.txt.gz 
-Dtests.luceneMatchVersion=5.0 -Dtests.cleanthreads=perClass 
-Djava.util.logging.config.file=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/testlogging.properties
 -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true 
-Dtests.asserts.gracious=false -Dtests.multiplier=3 -DtempDir=. 
-Djava.io.tmpdir=. 
-Djunit4.tempDir=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build/solr-core/test/temp
 
-Dclover.db.dir=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/clover/db
 -Djava.security.manager=org.apache.lucene.util.TestSecurityManager 
-Djava.security.policy=/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/tools/junit4/tests.policy
 -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 
-Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory 
-Djava.awt.headless=true -Dfile.encoding=ISO-8859-1 -classpath 

[jira] [Commented] (SOLR-4016) Deduplication is broken by partial update

2012-12-27 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539959#comment-13539959
 ] 

Shalin Shekhar Mangar commented on SOLR-4016:
-

All these problems go away if we add the DistributedUpdateProcessorFactory 
ahead of all other processors instead of adding it just before 
RunUpdateProcessorFactory by default.

[~yo...@apache.org], is there a reason why we add DUPF last? It seems 
that other processors such as RegexReplaceProcessorFactory will also be 
affected.

 Deduplication is broken by partial update
 -

 Key: SOLR-4016
 URL: https://issues.apache.org/jira/browse/SOLR-4016
 Project: Solr
  Issue Type: Bug
  Components: update
Affects Versions: 4.0
 Environment: Tomcat6 / Catalina on Ubuntu 12.04 LTS
Reporter: Joel Nothman
Assignee: Shalin Shekhar Mangar
  Labels: 4.0.1_Candidate
 Fix For: 4.1, 5.0


 The SignatureUpdateProcessorFactory used (primarily?) for deduplication does 
 not consider partial update semantics.
 The below uses the following solrconfig.xml excerpt:
 {noformat}
   <updateRequestProcessorChain name="text_hash">
     <processor class="solr.processor.SignatureUpdateProcessorFactory">
       <bool name="enabled">true</bool>
       <str name="signatureField">text_hash</str>
       <bool name="overwriteDupes">false</bool>
       <str name="fields">text</str>
       <str name="signatureClass">solr.processor.TextProfileSignature</str>
     </processor>
     <processor class="solr.LogUpdateProcessorFactory" />
     <processor class="solr.RunUpdateProcessorFactory" />
   </updateRequestProcessorChain>
 {noformat}
 Firstly, the processor treats {noformat}{"set": "value"}{noformat} as a 
 string and hashes it, instead of the value alone:
 {noformat}
 $ curl '$URL/update?commit=true' -H 'Content-type:application/json' \
     -d '{"add":{"doc":{"id": "abcde", "text": {"set": "hello world"}}}}' \
     && curl '$URL/select?q=id:abcde'
 {"responseHeader":{"status":0,"QTime":30}}
 <?xml version="1.0" encoding="UTF-8"?><response><lst name="responseHeader">
 <int name="status">0</int><int name="QTime">1</int>
 <lst name="params"><str name="q">id:abcde</str></lst></lst>
 <result name="response" numFound="1" start="0"><doc>
 <str name="id">abcde</str><str name="text">hello world</str>
 <str name="text_hash">ad48c7ad60ac22cc</str>
 <long name="_version_">1417247434224959488</long></doc></result>
 </response>
 $
 $ curl '$URL/update?commit=true' -H 'Content-type:application/json' \
     -d '{"add":{"doc":{"id": "abcde", "text": "hello world"}}}' \
     && curl '$URL/select?q=id:abcde'
 {"responseHeader":{"status":0,"QTime":27}}
 <?xml version="1.0" encoding="UTF-8"?>
 <response>
 <lst name="responseHeader"><int name="status">0</int><int name="QTime">1</int>
 <lst name="params"><str name="q">id:abcde</str></lst></lst>
 <result name="response" numFound="1" start="0"><doc>
 <str name="id">abcde</str><str name="text">hello world</str>
 <str name="text_hash">b169c743d220da8d</str>
 <long name="_version_">141724802221564</long></doc></result>
 </response>
 {noformat}
 Note the different text_hash value.
 Secondly, when updating a field other than those used to create the signature 
 (which I imagine is a more common use-case), the signature is recalculated 
 from no values:
 {noformat}
 $ curl '$URL/update?commit=true' -H 'Content-type:application/json' \
     -d '{"add":{"doc":{"id": "abcde", "title": {"set": "new title"}}}}' \
     && curl '$URL/select?q=id:abcde'
 {"responseHeader":{"status":0,"QTime":39}}
 <?xml version="1.0" encoding="UTF-8"?>
 <response>
 <lst name="responseHeader"><int name="status">0</int><int name="QTime">1</int>
 <lst name="params"><str name="q">id:abcde</str></lst></lst>
 <result name="response" numFound="1" start="0"><doc>
 <str name="id">abcde</str><str name="text">hello world</str>
 <str name="text_hash"></str><str name="title">new title</str>
 <long name="_version_">1417248120480202752</long></doc></result>
 </response>
 {noformat}
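
The two symptoms reported above can be modelled outside Solr. The following is only a sketch under stated assumptions: `signature` uses MD5 as a stand-in (it is not TextProfileSignature), and `unwrap_set` and `should_recompute` are hypothetical helpers showing one possible way a processor could respect partial-update semantics, not the fix that was committed.

```python
import hashlib


def signature(value) -> str:
    """Stand-in signature: MD5 of the string form (not TextProfileSignature)."""
    return hashlib.md5(str(value).encode("utf-8")).hexdigest()[:16]


def unwrap_set(value):
    """Return the payload of a {"set": ...} update map, or the value unchanged."""
    if isinstance(value, dict) and set(value) == {"set"}:
        return value["set"]
    return value


def should_recompute(doc: dict, signature_fields: list) -> bool:
    """Recompute only when the update touches at least one signature field."""
    return any(field in doc for field in signature_fields)


full_doc = {"id": "abcde", "text": "hello world"}
partial_text = {"id": "abcde", "text": {"set": "hello world"}}
partial_title = {"id": "abcde", "title": {"set": "new title"}}

# Hashing the update map itself gives a different signature than the plain value.
print(signature(partial_text["text"]) == signature(full_doc["text"]))              # False
# Hashing the unwrapped value matches the full-document signature.
print(signature(unwrap_set(partial_text["text"])) == signature(full_doc["text"]))  # True
# An update that only touches "title" should not trigger a new signature.
print(should_recompute(partial_title, ["text"]))                                   # False
```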




[jira] [Commented] (SOLR-1972) Need additional query stats in admin interface - median, 95th and 99th percentile

2012-12-27 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539978#comment-13539978
 ] 

Shawn Heisey commented on SOLR-1972:


Lance, if I am reading OnlineSummarizer right, it only gives you five 
percentile numbers - 0% (min), 50% (median), 100% (max), and two others that 
are not explicitly quantified, but I am guessing are 25% and 75%.  The 95% and 
99% points are not present, and I would argue strongly that those are the most 
useful numbers currently available.  Median is important, but 95% and 99% are 
the numbers that will show problems first.


 Need additional query stats in admin interface - median, 95th and 99th 
 percentile
 -

 Key: SOLR-1972
 URL: https://issues.apache.org/jira/browse/SOLR-1972
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Affects Versions: 1.4
Reporter: Shawn Heisey
Assignee: Alan Woodward
Priority: Minor
 Fix For: 4.2, 5.0

 Attachments: elyograg-1972-3.2.patch, elyograg-1972-3.2.patch, 
 elyograg-1972-trunk.patch, elyograg-1972-trunk.patch, leak-closeable.patch, 
 leak.patch, revert-SOLR-1972.patch, SOLR-1972-branch3x-url_pattern.patch, 
 SOLR-1972-branch4x.patch, SOLR-1972-branch4x.patch, SOLR-1972_metrics.patch, 
 SOLR-1972_metrics.patch, SOLR-1972_metrics.patch, SOLR-1972_metrics.patch, 
 SOLR-1972_metrics.patch, SOLR-1972_metrics.patch, SOLR-1972_metrics.patch, 
 SOLR-1972_metrics.patch, solr1972-metricsregistry-branch4x-failure.log, 
 SOLR-1972.patch, SOLR-1972.patch, SOLR-1972.patch, SOLR-1972.patch, 
 SOLR-1972-url_pattern.patch, stacktraces.tar.gz


 I would like to see more detailed query statistics from the admin GUI.  This 
 is what you can get now:
 requests : 809
 errors : 0
 timeouts : 0
 totalTime : 70053
 avgTimePerRequest : 86.59209
 avgRequestsPerSecond : 0.8148785 
 I'd like to see more data on the time per request - median, 95th percentile, 
 99th percentile, and any other statistical function that makes sense to 
 include.  In my environment, the first bunch of queries after startup tend to 
 take several seconds each.  I find that the average value tends to be useless 
 until it has several thousand queries under its belt and the caches are 
 thoroughly warmed.  The statistical functions I have mentioned would quickly 
 eliminate the influence of those initial slow queries.
 The system will have to store individual data about each query.  I don't know 
 if this is something Solr does already.  It would be nice to have a 
 configurable count of how many of the most recent data points are kept, to 
 control the amount of memory the feature uses.  The default value could be 
 something like 1024 or 4096.
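
The request above (median, 95th and 99th percentile over a configurable number of recent data points) can be sketched as follows. This is an illustration only, not any of the attached patches: `RecentTimes` is a hypothetical class, and the nearest-rank percentile definition is an assumption.

```python
import math
from collections import deque


class RecentTimes:
    """Keeps only the most recent request times, bounded by max_samples."""

    def __init__(self, max_samples: int = 1024):
        self._samples = deque(maxlen=max_samples)  # oldest entries fall off

    def record(self, millis: float) -> None:
        self._samples.append(millis)

    def mean(self) -> float:
        return sum(self._samples) / len(self._samples)

    def percentile(self, p: int) -> float:
        """Nearest-rank percentile for an integer p in 1..100."""
        if not self._samples:
            raise ValueError("no samples recorded")
        ordered = sorted(self._samples)
        rank = max(1, math.ceil(p * len(ordered) / 100))
        return ordered[rank - 1]


window = RecentTimes(max_samples=1024)
for millis in [5000, 4000] + [80] * 98:   # two slow warm-up queries, then 98 fast ones
    window.record(millis)

print(window.mean())          # 168.4, pulled up by the two slow queries
print(window.percentile(50))  # 80
print(window.percentile(95))  # 80
print(window.percentile(99))  # 4000
```

With a bounded window the memory cost is fixed, and the slow start-up queries drop out once enough newer requests have been recorded.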




[jira] [Commented] (SOLR-4236) Commit issue: Can't search while add commit=true in the call URL about insert index

2012-12-27 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539987#comment-13539987
 ] 

Mark Miller commented on SOLR-4236:
---

Hey Raintung, I have not had a chance to fully understand this yet, but is this 
issue a dupe of SOLR-3933?

 Commit issue: Can't search while add commit=true in the call URL about insert 
 index
 ---

 Key: SOLR-4236
 URL: https://issues.apache.org/jira/browse/SOLR-4236
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0-BETA, 4.0
 Environment: one collection, one shard, three servers (one leader, two 
 replicas)
Reporter: Raintung Li
  Labels: commit

 I set up three SolrCloud instances for the same collection and shard; one 
 instance is the shard leader and the others are replicas.
 Send an index request to one instance. An example URL looks like this:
 curl "http://localhost:7002/solr/update?commit=true" -H "Content-Type: text/xml" \
     --data-binary '<add><doc><field name="id">test</field></doc></add>'
 If the request is sent to the leader, only the leader can find the document 
 in searches; the replicas cannot. I have disabled autoSoftCommit.
 If the request is sent to a replica, none of the servers can find the 
 document.
 The major problem:
 SolrCmdDistributor's distribAdd method batches some requests in a cache.
 DistributedUpdateProcessor's processCommit method triggers sending the 
 batched distributed requests only after the commit request has been sent.
 If the test index request is sent to a replica, the replica forwards the 
 request to the leader. But in this case, the commit command is sent to the 
 other servers before the actual index request. The document becomes 
 searchable only after a soft commit or another commit command arrives.
 One point of confusion: why doesn't the leader need to send the commit 
 command to the replicas? Does only the server that received the request send 
 the commit to all servers in the shard?
 It looks like Solr doesn't implement transaction logic.
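
The ordering problem described in the report can be shown with a toy model. This is a sketch only: `Replica` and `BufferingDistributor` are hypothetical stand-ins and do not reproduce the SolrCmdDistributor or DistributedUpdateProcessor code.

```python
class Replica:
    """Toy node: documents become searchable only when a commit arrives."""

    def __init__(self):
        self.pending = []
        self.searchable = []

    def add(self, doc: str) -> None:
        self.pending.append(doc)

    def commit(self) -> None:
        self.searchable.extend(self.pending)
        self.pending.clear()


class BufferingDistributor:
    """Hypothetical distributor that batches adds before sending them on."""

    def __init__(self, replica: Replica):
        self.replica = replica
        self.buffer = []

    def dist_add(self, doc: str) -> None:
        self.buffer.append(doc)  # batched, not yet sent

    def flush(self) -> None:
        for doc in self.buffer:
            self.replica.add(doc)
        self.buffer.clear()

    def commit_then_flush(self) -> None:
        """Ordering described in the report: commit overtakes the batched add."""
        self.replica.commit()
        self.flush()

    def flush_then_commit(self) -> None:
        """Ordering that makes the document visible to the commit."""
        self.flush()
        self.replica.commit()


reported = BufferingDistributor(Replica())
reported.dist_add("test")
reported.commit_then_flush()
print(reported.replica.searchable)  # []

ordered = BufferingDistributor(Replica())
ordered.dist_add("test")
ordered.flush_then_commit()
print(ordered.replica.searchable)   # ['test']
```

In the first case the document stays pending until some later commit arrives, which matches the behaviour the reporter describes.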




[jira] [Commented] (SOLR-1972) Need additional query stats in admin interface - median, 95th and 99th percentile

2012-12-27 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1353#comment-1353
 ] 

Commit Tag Bot commented on SOLR-1972:
--

[trunk commit] Uwe Schindler
http://svn.apache.org/viewvc?view=revision&revision=1426230

Revert SOLR-1972





[jira] [Created] (SOLR-4237) Implement index aliasing

2012-12-27 Thread Otis Gospodnetic (JIRA)
Otis Gospodnetic created SOLR-4237:
--

 Summary: Implement index aliasing
 Key: SOLR-4237
 URL: https://issues.apache.org/jira/browse/SOLR-4237
 Project: Solr
  Issue Type: New Feature
Reporter: Otis Gospodnetic
 Fix For: 4.2


This is handy for searching log indices and in all other situations where 
indices are added (and possibly deleted) over time.  Index aliasing allows one 
to map an arbitrary set of indices to an alias and avoid needing to change the 
search application to point it to new indices.

See http://search-lucene.com/m/YBn4w1UAbEB

It may also be worth thinking about using aliases when indexing.  This question 
comes up once in a while on the ElasticSearch mailing list for example.
See 
http://search-lucene.com/?q=index+time+alias&fc_project=ElasticSearch&fc_type=mail+_hash_+user
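
As a rough illustration of the aliasing idea described above, the sketch below maps an alias to a set of indices so that the search application only ever refers to the alias. `AliasRegistry` and the index names are hypothetical; nothing here reflects an existing Solr API.

```python
class AliasRegistry:
    """Hypothetical alias table: alias name -> list of concrete index names."""

    def __init__(self):
        self._aliases = {}

    def set_alias(self, alias: str, indices: list) -> None:
        self._aliases[alias] = list(indices)

    def resolve(self, name: str) -> list:
        """Return the indices behind an alias, or the name itself if it is not one."""
        return list(self._aliases.get(name, [name]))


registry = AliasRegistry()
registry.set_alias("logs", ["logs-2012-12-26", "logs-2012-12-27"])
print(registry.resolve("logs"))  # ['logs-2012-12-26', 'logs-2012-12-27']

# A new daily index is added and an old one dropped; the application still
# searches "logs" and does not need to change.
registry.set_alias("logs", ["logs-2012-12-27", "logs-2012-12-28"])
print(registry.resolve("logs"))  # ['logs-2012-12-27', 'logs-2012-12-28']
```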






[jira] [Commented] (SOLR-1972) Need additional query stats in admin interface - median, 95th and 99th percentile

2012-12-27 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540001#comment-13540001
 ] 

Uwe Schindler commented on SOLR-1972:
-

Shawn: I reverted the commits related to this issue (done by romseygeek) in 
trunk revision: 1426230, merged/reverted 4.x revision: 1426234

You now have a clean start again!




[jira] [Commented] (SOLR-1972) Need additional query stats in admin interface - median, 95th and 99th percentile

2012-12-27 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540004#comment-13540004
 ] 

Commit Tag Bot commented on SOLR-1972:
--

[branch_4x commit] Uwe Schindler
http://svn.apache.org/viewvc?view=revision&revision=1426234

Merged revision(s) 1426230 from lucene/dev/trunk:
Revert SOLR-1972





[jira] [Commented] (SOLR-4175) SearchComponent chain can't contain two components of the same class and use debugQuery

2012-12-27 Thread Tomás Fernández Löbbe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540019#comment-13540019
 ] 

Tomás Fernández Löbbe commented on SOLR-4175:
-

Any comments on this issue?

 SearchComponent chain can't contain two components of the same class and use 
 debugQuery
 ---

 Key: SOLR-4175
 URL: https://issues.apache.org/jira/browse/SOLR-4175
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Tomás Fernández Löbbe
Priority: Minor
 Fix For: 4.1, 5.0

 Attachments: failure.patch, SOLR-4175.patch


 steps to reproduce the issue:
 1) Add two components of the same type to the components chain of the request 
 handler
 2) start solr with assertions enabled
 3) run a query to the request handler configured in 1 with debugQuery=true
 The query will throw a java.lang.AssertionError. I'll attach a test case to 
 reproduce the issue.
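
One way such an assertion could arise is if per-component debug information is keyed by something that two instances of the same class share. The sketch below is purely an assumption about the failure mode, not the actual Solr code: `Component`, `collect_debug_timing` and the component names are hypothetical.

```python
from collections import namedtuple

# A component has a configured name and an implementing class.
Component = namedtuple("Component", ["name", "class_name"])


def collect_debug_timing(components, key_fn) -> dict:
    """Build a timing map; assert that every component gets its own entry."""
    timings = {}
    for component in components:
        key = key_fn(component)
        assert key not in timings, "duplicate debug key: " + key
        timings[key] = 0.0
    return timings


chain = [
    Component(name="mock-one", class_name="MockSearchComponent"),
    Component(name="mock-two", class_name="MockSearchComponent"),
]

# Keyed by the configured name, both instances get an entry.
print(sorted(collect_debug_timing(chain, lambda c: c.name)))  # ['mock-one', 'mock-two']

# Keyed by class name, the second instance collides with the first.
try:
    collect_debug_timing(chain, lambda c: c.class_name)
except AssertionError as error:
    print("AssertionError:", error)
```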




[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting

2012-12-27 Thread Chris Russell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540023#comment-13540023
 ] 

Chris Russell commented on SOLR-2894:
-

That is odd, it worked when I tested it on my box.  I will take another look.

 Implement distributed pivot faceting
 

 Key: SOLR-2894
 URL: https://issues.apache.org/jira/browse/SOLR-2894
 Project: Solr
  Issue Type: Improvement
Reporter: Erik Hatcher
 Fix For: 4.1

 Attachments: SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894-reworked.patch


 Following up on SOLR-792, pivot faceting currently only supports 
 undistributed mode.  Distributed pivot faceting needs to be implemented.




[jira] [Commented] (SOLR-1853) ReplicationHandler reports incorrect replication failures

2012-12-27 Thread Vadim Kirilchuk (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540025#comment-13540025
 ] 

Vadim Kirilchuk commented on SOLR-1853:
---

another 1 year ago =)

r929454 doesn't actually fix the mentioned issue. However, it seems that 
there is no such issue in 4.x.

 ReplicationHandler reports incorrect replication failures
 -

 Key: SOLR-1853
 URL: https://issues.apache.org/jira/browse/SOLR-1853
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 1.4
 Environment: Linux
Reporter: Shawn Smith

 The ReplicationHandler details command reports that replication failed when 
 it didn't.  This occurs after a slave is restarted when it is already in sync 
 with the master.  This makes it difficult to write production monitors that 
 check the health of master-slave replication (no network issues, unexpected 
 slowdowns, etc).
 From the code, it looks like SnapPuller.successfulInstall starts out false 
 on restart.  If the slave starts out in sync with the master, then each no-op 
 replication poll leaves successfulInstall set to false which makes 
 SnapPuller.logReplicationTimeAndConfFiles log the poll as a failure.  
 SnapPuller.successfulInstall stays false until the first time replication 
 actually has to do something, at which point it gets set to true, and then 
 everything is OK.
 h4. Steps to reproduce
 # Set up Solr master and slave servers using Solr 1.4 Java replication.
 # Index some content on the master.  Wait for it to replicate through to the 
 slave so the master and slave are in sync.
 # Stop the slave server.
 # Restart the slave server.
 # Wait for the first slave replication poll.
 # Query the replication status using 
 http://localhost:8983/solr/replication?command=details
 # Until the master index changes and there's something to replicate, all 
 slave replication polls after the restart will be shown as failed in the XML 
 response.
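
The flag behaviour described in the report can be modelled with a small state machine. This is a simplified sketch, not the SnapPuller code: `ReplicationPoller` and its `treat_noop_as_success` switch are hypothetical, and the switch shows only one possible way to avoid logging no-op polls as failures.

```python
class ReplicationPoller:
    """Toy model of a slave polling a master for a newer index version."""

    def __init__(self, treat_noop_as_success: bool):
        self.successful_install = False  # starts out false after a restart
        self.treat_noop_as_success = treat_noop_as_success

    def poll(self, master_version: int, slave_version: int) -> str:
        """Return how the poll would be logged: 'ok' or 'failed'."""
        if master_version == slave_version:
            if self.treat_noop_as_success:
                return "ok"  # nothing to do is not a failure
        else:
            self.successful_install = True  # assume the install itself works
        return "ok" if self.successful_install else "failed"


reported = ReplicationPoller(treat_noop_as_success=False)
print(reported.poll(5, 5))  # failed (slave already in sync after restart)
print(reported.poll(6, 5))  # ok (first real replication sets the flag)
print(reported.poll(6, 6))  # ok (flag stays true from here on)

adjusted = ReplicationPoller(treat_noop_as_success=True)
print(adjusted.poll(5, 5))  # ok
```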




[jira] [Commented] (SOLR-4175) SearchComponent chain can't contain two components of the same class and use debugQuery

2012-12-27 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540055#comment-13540055
 ] 

Erik Hatcher commented on SOLR-4175:


Tomas - looks like a good solution to me.  It's one-to-one for a component 
instance and its name, so this works out nicely.

One bit of test improvement could be to have MockSearchComponent write 
something to the response that the test picks up, something that is pulled from 
the config init, so that both the init params and separate instance cases are 
accounted for in the tests explicitly.




[jira] [Commented] (SOLR-4208) Refactor edismax query parser

2012-12-27 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540071#comment-13540071
 ] 

Erik Hatcher commented on SOLR-4208:


Tomas - regarding the idea of using Solr plugin points into the edismax parser 
- on second thought it might be premature to put a plugin system in there at 
this point.  Given the couple of examples you've added to the test cases, doing 
it with a subclass or an extension still requires Java coding and plugging in 
something, so again it's probably overkill to consider the plugin thing.  But 
here's how it'd work (using the HighlighterComponent, such as formatter and 
encoders) where a plugin was responsible for the multi-lingual field logic of 
overriding the configuration object in your example, or a different plugin that 
was used to plug in a response to getFieldQuery for your other example.  But I 
think leaving it how you've got it set up for subclassing works fine for now.  
+1

 Refactor edismax query parser
 -

 Key: SOLR-4208
 URL: https://issues.apache.org/jira/browse/SOLR-4208
 Project: Solr
  Issue Type: Improvement
Reporter: Tomás Fernández Löbbe
Priority: Minor
 Fix For: 4.1, 5.0

 Attachments: qParserDiff.txt, SOLR-4208.patch, SOLR-4208.patch


 With successive changes, the edismax query parser has become more complex. It 
 would be nice to refactor it to reduce code complexity, also to allow better 
 extension and code reuse.




[jira] [Commented] (LUCENE-3413) CombiningFilter to recombine tokens into a single token for sorting

2012-12-27 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540073#comment-13540073
 ] 

Chris A. Mattmann commented on LUCENE-3413:
---

Hi Guys, there seems to be some interest on the list in such a capability: 
http://lucene.472066.n3.nabble.com/Which-token-filter-can-combine-2-terms-into-1-td4028482.html
 (or at least it sounds similar). Is anyone interested in working with me to 
commit this?

 CombiningFilter to recombine tokens into a single token for sorting
 ---

 Key: LUCENE-3413
 URL: https://issues.apache.org/jira/browse/LUCENE-3413
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/analysis
Affects Versions: 2.9.3
Reporter: Chris A. Mattmann
Priority: Minor
 Attachments: LUCENE-3413.Mattmann.090311.patch.txt, 
 LUCENE-3413.Mattmann.090511.patch.txt


 I whipped up this CombiningFilter for the following use case:
 I've got a bunch of titles of e.g., Books, such as:
 The Grapes of Wrath
 Tommy Tommerson saves the World
 Top of the World
 The Tales of Beedle the Bard
 Born Free
 etc.
 I want to sort these titles using a String field that includes stopword 
 analysis (e.g., to remove The), and synonym filtering (e.g., for grouping), 
 etc. I created an analysis chain in Solr for this that was based off of 
 *alphaOnlySort*, which looks like this:
 {code:xml}
 <fieldType name="alphaOnlySort" class="solr.TextField" sortMissingLast="true" omitNorms="true">
   <analyzer>
     <!-- KeywordTokenizer does no actual tokenizing, so the entire
          input string is preserved as a single token
       -->
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <!-- The LowerCase TokenFilter does what you expect, which can be
          useful when you want your sorting to be case insensitive
       -->
     <filter class="solr.LowerCaseFilterFactory" />
     <!-- The TrimFilter removes any leading or trailing whitespace -->
     <filter class="solr.TrimFilterFactory" />
     <!-- The PatternReplaceFilter gives you the flexibility to use
          Java Regular expression to replace any sequence of characters
          matching a pattern with an arbitrary replacement string,
          which may include back references to portions of the original
          string matched by the pattern.

          See the Java Regular Expression documentation for more
          information on pattern and replacement string syntax.

          http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/package-summary.html
       -->
     <filter class="solr.PatternReplaceFilterFactory"
             pattern="([^a-z])" replacement="" replace="all"
     />
   </analyzer>
 </fieldType>
 {code}
 The issue with alphaOnlySort is that it doesn't support stopword removal or 
 synonyms, because those operate on the original tokens rather than on the 
 full strings produced by the KeywordTokenizer (which does not do 
 tokenization). I needed a filter that would allow me to change alphaOnlySort 
 and its analysis chain from using KeywordTokenizer to using 
 WhitespaceTokenizer, and then a way to recombine the tokens at the end. So, 
 take "The Grapes of Wrath". I needed a way for it to get turned into:
 {noformat}
 grapes of wrath
 {noformat}
 And then to combine those tokens into a single token:
 {noformat}
 grapesofwrath
 {noformat}
 The attached CombiningFilter takes care of that. It doesn't do it super 
 efficiently I'm guessing (since I used a StringBuffer), but I'm open to 
 suggestions on how to make it better. 
 One other thing is that apparently this analyzer works fine for analysis 
 (e.g., it produces the desired tokens), however, for sorting in Solr I'm 
 getting null sort tokens. Need to figure out why. 
 Here ya go!
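
To make the intended transformation concrete, here is a minimal plain-Java sketch of what the chain described above is meant to produce. It is not the attached CombiningFilter and does not use the Lucene TokenFilter API; the class name SortKeySketch and the tiny stopword set are assumptions made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Plain-Java illustration only; this is not the attached CombiningFilter. */
class SortKeySketch {
    // Hypothetical stopword list, chosen so the output matches the example above.
    private static final Set<String> STOPWORDS =
            new HashSet<String>(Arrays.asList("the", "a", "an"));

    /** Lowercase, split on whitespace, drop stopwords, then join what is left. */
    static String sortKey(String title) {
        StringBuilder combined = new StringBuilder();
        for (String token : title.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty() || STOPWORDS.contains(token)) {
                continue; // skip stopwords and the empty token produced by blank input
            }
            combined.append(token);
        }
        return combined.toString();
    }

    public static void main(String[] args) {
        System.out.println(sortKey("The Grapes of Wrath")); // prints grapesofwrath
    }
}
```

A StringBuilder is used here rather than a StringBuffer because the key is built on a single thread, so the synchronization of StringBuffer buys nothing.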




[JENKINS] Lucene-Solr-trunk-MacOSX (32bit/jdk1.7.0) - Build # 1 - Failure!

2012-12-27 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-MacOSX/1/
Java: 32bit/jdk1.7.0 -server -XX:+UseG1GC

No tests ran.

Build Log:
[...truncated 7460 lines...]
FATAL: command execution failed
java.io.IOException: Cannot run program "cmd" (in directory "/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX"): error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
    at hudson.Proc$LocalProc.<init>(Proc.java:244)
    at hudson.Proc$LocalProc.<init>(Proc.java:216)
    at hudson.Launcher$LocalLauncher.launch(Launcher.java:763)
    at hudson.Launcher$ProcStarter.start(Launcher.java:353)
    at hudson.Launcher$RemoteLaunchCallable.call(Launcher.java:988)
    at hudson.Launcher$RemoteLaunchCallable.call(Launcher.java:955)
    at hudson.remoting.UserRequest.perform(UserRequest.java:118)
    at hudson.remoting.UserRequest.perform(UserRequest.java:48)
    at hudson.remoting.Request$2.run(Request.java:326)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.IOException: error=2, No such file or directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:135)
    at java.lang.ProcessImpl.start(ProcessImpl.java:130)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1021)
    ... 15 more
Build step 'Execute Windows batch command' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.7.0 -server -XX:+UseG1GC
Email was triggered for: Failure
Sending email for trigger: Failure




[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.6.0) - Build # 2 - Still Failing!

2012-12-27 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-MacOSX/2/
Java: 64bit/jdk1.6.0 -XX:+UseSerialGC

No tests ran.

Build Log:
[...truncated 75 lines...]
BUILD FAILED
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/build.xml:353: 
The following error occurred while executing this line:
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/build.xml:39: 
The following error occurred while executing this line:
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/build.xml:50:
 The following error occurred while executing this line:
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/common-build.xml:330:
 The following error occurred while executing this line:
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/common-build.xml:367:
 Ivy is not available

Total time: 3 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 64bit/jdk1.6.0 -XX:+UseSerialGC
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-27 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540122#comment-13540122
 ] 

Mark Miller commented on SOLR-4114:
---

Ran into some trouble with the 'no two shards use the same index dir' check 
with a git checkout - finally traced it down to checking the source attribute 
from the solrcore mbean - and when not using svn, this can be $URL. I'll 
look at an alternate check to do instead.

 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Mark Miller
  Labels: collection-api, multicore, shard, shard-allocation
 Fix For: 4.1, 5.0

 Attachments: 
 SOLR-4114_mocking_OverseerCollectionProcessorTest_branch_4x.patch, 
 SOLR-4114.patch, SOLR-4114.patch, SOLR-4114.patch, SOLR-4114.patch, 
 SOLR-4114_trunk.patch


 We should support running multiple shards from one collection on the same 
 Solr server - e.g. run a collection with 8 shards on a 4 Solr server cluster 
 (each Solr server running 2 shards).
 Performance tests on our side have shown that this is a good idea, and it is 
 also a good idea for easy elasticity later on - it is much easier to move 
 entire existing shards from one Solr server to another one that just joined 
 the cluster than it is to split an existing shard between the Solr server 
 that used to run it and the new one.
 See dev mailing list discussion "Multiple shards for one collection on the 
 same Solr server"
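
As a rough illustration of the "8 shards on 4 servers" layout described above, the following sketch assigns shards to nodes round-robin. It is only an assumption about one possible placement policy, not how the Collection API or the attached patches decide placement; the class name, the node names and the shardN naming are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical round-robin placement; not taken from the SOLR-4114 patches. */
class ShardPlacementSketch {
    /** Returns node -> shard names, spreading numShards evenly over the nodes. */
    static Map<String, List<String>> assign(List<String> nodes, int numShards) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        Map<String, List<String>> placement = new LinkedHashMap<String, List<String>>();
        for (String node : nodes) {
            placement.put(node, new ArrayList<String>());
        }
        for (int i = 0; i < numShards; i++) {
            String node = nodes.get(i % nodes.size()); // wrap around the node list
            placement.get(node).add("shard" + (i + 1));
        }
        return placement;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("solr1", "solr2", "solr3", "solr4");
        // With 8 shards and 4 nodes, every node ends up with exactly 2 shards.
        System.out.println(assign(nodes, 8));
    }
}
```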




[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 5 - Still Failing!

2012-12-27 Thread Policeman Jenkins Server
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-MacOSX/5/
Java: 64bit/jdk1.7.0 -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 26169 lines...]
BUILD FAILED
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/build.xml:60: The following error occurred while executing this line:
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/build.xml:242: The following error occurred while executing this line:
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/common-build.xml:1946: Execute failed: java.io.IOException: Cannot run program "python3.2" (in directory "/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene"): error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
    at java.lang.Runtime.exec(Runtime.java:615)
    at org.apache.tools.ant.taskdefs.Execute$Java13CommandLauncher.exec(Execute.java:862)
    at org.apache.tools.ant.taskdefs.Execute.launch(Execute.java:481)
    at org.apache.tools.ant.taskdefs.Execute.execute(Execute.java:495)
    at org.apache.tools.ant.taskdefs.ExecTask.runExecute(ExecTask.java:631)
    at org.apache.tools.ant.taskdefs.ExecTask.runExec(ExecTask.java:672)
    at org.apache.tools.ant.taskdefs.ExecTask.execute(ExecTask.java:498)
    at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291)
    at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
    at org.apache.tools.ant.Task.perform(Task.java:348)
    at org.apache.tools.ant.taskdefs.Sequential.execute(Sequential.java:68)
    at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291)
    at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
    at org.apache.tools.ant.Task.perform(Task.java:348)
    at org.apache.tools.ant.taskdefs.MacroInstance.execute(MacroInstance.java:398)
    at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291)
    at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
    at org.apache.tools.ant.Task.perform(Task.java:348)
    at org.apache.tools.ant.Target.execute(Target.java:390)
    at org.apache.tools.ant.Target.performTasks(Target.java:411)
    at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1399)
    at org.apache.tools.ant.helper.SingleCheckExecutor.executeTargets(SingleCheckExecutor.java:38)
    at org.apache.tools.ant.Project.executeTargets(Project.java:1251)
    at org.apache.tools.ant.taskdefs.Ant.execute(Ant.java:442)
    at org.apache.tools.ant.taskdefs.SubAnt.execute(SubAnt.java:302)
    at org.apache.tools.ant.taskdefs.SubAnt.execute(SubAnt.java:221)
    at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291)
    at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
    at org.apache.tools.ant.Task.perform(Task.java:348)
    at org.apache.tools.ant.Target.execute(Target.java:390)
    at org.apache.tools.ant.Target.performTasks(Target.java:411)
    at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1399)
    at org.apache.tools.ant.Project.executeTarget(Project.java:1368)
    at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
    at org.apache.tools.ant.Project.executeTargets(Project.java:1251)
    at org.apache.tools.ant.Main.runBuild(Main.java:809)
    at org.apache.tools.ant.Main.startAnt(Main.java:217)
    at org.apache.tools.ant.launch.Launcher.run(Launcher.java:280)
    at org.apache.tools.ant.launch.Launcher.main(Launcher.java:109)
Caused by: java.io.IOException: error=2, No such file or directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:135)
    at java.lang.ProcessImpl.start(ProcessImpl.java:130)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1021)

RE: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 5 - Still Failing!

2012-12-27 Thread Uwe Schindler
Should be fixed now (problem was that /usr/local/bin was not in path when 
Jenkins starts the slave, because Jenkins startup does not eval /etc/profile).
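
For anyone debugging a similar "Cannot run program" failure, a quick way to see what the slave's JVM can actually find is to walk its PATH by hand. This is only a sketch: the class and method names are made up, and it assumes the lookup just takes the first executable file with that name in each PATH directory.

```java
import java.io.File;

/** Hypothetical diagnostic helper; not part of Jenkins or the Lucene build. */
class PathLookupSketch {
    /** Returns the first executable file named program in a PATH-style string, or null. */
    static File findOnPath(String program, String pathValue) {
        if (pathValue == null) {
            return null; // PATH is not set at all
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, program);
            if (candidate.isFile() && candidate.canExecute()) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Inspect the PATH this JVM inherited, which may differ from a login shell's.
        File found = findOnPath("python3.2", System.getenv("PATH"));
        System.out.println(found == null ? "python3.2 is not on PATH" : found.getPath());
    }
}
```

Running this from a Jenkins job rather than from a terminal shows the environment the build really gets.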

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-27 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540154#comment-13540154
 ] 

Commit Tag Bot commented on SOLR-4114:
--

[trunk commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1426329

SOLR-4114: tests: make happy with git - source attrib counts on svn substitution






[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-27 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540160#comment-13540160
 ] 

Commit Tag Bot commented on SOLR-4114:
--

[branch_4x commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1426330

SOLR-4114: tests: make happy with git - source attrib counts on svn substitution






Faster GitHub mirroring

2012-12-27 Thread Mark Miller
For anyone out there using git to work with lucene-solr, just an fyi:

In the past I've always done my checkout from github and then added the apache 
github repo as an upstream repo - this was because github often lagged by hours 
to sometimes days or longer, but the apache repo was dirt slow.

I saw some emails go by on infra not too long ago and it looks like mirroring 
lag to github has been significantly improved (I think it might even mirror on 
every commit).

Anyhow, it's fast enough that I'm now able to use it exclusively based on some 
recent testing - just wanted to let anyone else that might benefit from this 
know of the change.

- Mark



[jira] [Commented] (SOLR-3345) BaseDistributedSearchTestCase should always ignore QTime

2012-12-27 Thread Chris Russell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540181#comment-13540181
 ] 

Chris Russell commented on SOLR-3345:
-

I was just trying to write a new unit test and I ran into this. 
junit.framework.AssertionFailedError: .responseHeader.QTime:40!=84
D:


 BaseDistributedSearchTestCase should always ignore QTime
 

 Key: SOLR-3345
 URL: https://issues.apache.org/jira/browse/SOLR-3345
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0-ALPHA
Reporter: Benson Margulies

 The existing subclasses of BaseDistributedSearchTestCase all skip QTime. I 
 can't see any way in which those numbers will ever match. Why not make this 
 the default, or only, behavior?
 (This is really a question, in that I will provide a patch if no one tells me 
 that it is a bad idea.)
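
The idea of always skipping QTime can be sketched as a comparison that ignores a set of keys at any nesting level. This is an illustration with plain maps, not the actual BaseDistributedSearchTestCase comparison code; the class name, the helper that builds a fake response, and the field layout are all assumptions.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical comparison helper; not the Solr test framework's implementation. */
class ResponseCompareSketch {
    /** Recursively compares nested maps, skipping every key listed in ignoredKeys. */
    static boolean equalIgnoring(Object a, Object b, Set<String> ignoredKeys) {
        if (a instanceof Map && b instanceof Map) {
            Map<?, ?> left = (Map<?, ?>) a;
            Map<?, ?> right = (Map<?, ?>) b;
            Set<Object> keys = new HashSet<Object>(left.keySet());
            keys.addAll(right.keySet());
            for (Object key : keys) {
                if (ignoredKeys.contains(key)) {
                    continue; // timing values such as QTime never match across runs
                }
                if (left.containsKey(key) != right.containsKey(key)) {
                    return false;
                }
                if (!equalIgnoring(left.get(key), right.get(key), ignoredKeys)) {
                    return false;
                }
            }
            return true;
        }
        return a == null ? b == null : a.equals(b);
    }

    /** Builds a tiny fake response with a responseHeader and a hit count. */
    static Map<String, Object> response(int qTime, int numFound) {
        Map<String, Object> header = new LinkedHashMap<String, Object>();
        header.put("status", 0);
        header.put("QTime", qTime);
        Map<String, Object> body = new LinkedHashMap<String, Object>();
        body.put("numFound", numFound);
        Map<String, Object> top = new LinkedHashMap<String, Object>();
        top.put("responseHeader", header);
        top.put("response", body);
        return top;
    }

    public static void main(String[] args) {
        Set<String> ignored = Collections.singleton("QTime");
        // QTime 40 vs 84, as in the failure above; everything else is identical.
        System.out.println(equalIgnoring(response(40, 3), response(84, 3), ignored)); // prints true
    }
}
```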




[jira] [Commented] (SOLR-4170) Exception while creating snapshot

2012-12-27 Thread Marcin Rzewucki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540191#comment-13540191
 ] 

Marcin Rzewucki commented on SOLR-4170:
---

Tried with solr_41 (4.1.0.2012.12.14.09.42.29). It seems to be only partially 
fixed. The issue occurs once a new index directory has been created:
SEVERE: Exception while creating snapshot
java.io.FileNotFoundException: /solr/cores/aws/search/data/index/segments_1rg (No such file or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(Unknown Source)
    at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:223)
    at org.apache.lucene.store.Directory.copy(Directory.java:200)
    at org.apache.solr.handler.SnapShooter$FileCopier.copyFile(SnapShooter.java:205)
    at org.apache.solr.handler.SnapShooter$FileCopier.copyFiles(SnapShooter.java:189)
    at org.apache.solr.handler.SnapShooter.createSnapshot(SnapShooter.java:107)
    at org.apache.solr.handler.SnapShooter$1.run(SnapShooter.java:77)

The file segments_1rg exists, but in the 
/solr/cores/aws/search/data/index.20121227160834580/ directory. It seems that 
index.properties/getNewIndexDir is not respected by the snapshotting process. 
Could you check that, please?
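
To illustrate the lookup that seems to be missing, here is a sketch of resolving the live index directory from index.properties before copying files. It is not the SnapShooter code; the class and method names are made up, and it assumes that index.properties stores the directory name under a key called "index" and that a missing key means the default "index" directory.

```java
import java.io.File;
import java.util.Properties;

/** Hypothetical helper; assumes index.properties holds the dir name under "index". */
class IndexDirSketch {
    /** Returns the directory named by the "index" property, or dataDir/index if unset. */
    static File resolveIndexDir(File dataDir, Properties indexProps) {
        String name = indexProps == null ? null : indexProps.getProperty("index");
        if (name == null || name.trim().isEmpty()) {
            name = "index"; // no index.properties entry, so fall back to the default
        }
        return new File(dataDir, name.trim());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("index", "index.20121227160834580");
        File dataDir = new File("/solr/cores/aws/search/data");
        // Names the timestamped directory from the report above, not data/index.
        System.out.println(resolveIndexDir(dataDir, props).getName());
    }
}
```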

 Exception while creating snapshot
 -

 Key: SOLR-4170
 URL: https://issues.apache.org/jira/browse/SOLR-4170
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0
Reporter: Marcin Rzewucki

 I'm using SolrCloud. When I try to create an index snapshot, an exception 
 occurs:
 INFO: [test] webapp=/solr path=/replication params={command=backup&numberToKeep=1} status=0 QTime=1
 Dec 07, 2012 6:00:02 PM org.apache.solr.handler.SnapShooter createSnapshot
 SEVERE: Exception while creating snapshot
 java.io.FileNotFoundException: File /solr/cores/test/data/index/segments_g does not exist
     at org.apache.solr.handler.SnapShooter$FileCopier.copyFile(SnapShooter.java:194)
     at org.apache.solr.handler.SnapShooter$FileCopier.copyFiles(SnapShooter.java:185)
     at org.apache.solr.handler.SnapShooter.createSnapshot(SnapShooter.java:105)
     at org.apache.solr.handler.SnapShooter$1.run(SnapShooter.java:78)
 The issue occurs randomly. Reloading the core usually helps, but sometimes the 
 core has to be reloaded a couple of times before a snapshot can be made.




Re: Faster GitHub mirroring

2012-12-27 Thread David Smiley (@MITRE.org)
Thanks for the FYI!



-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--



[jira] [Commented] (SOLR-1972) Need additional query stats in admin interface - median, 95th and 99th percentile

2012-12-27 Thread Lance Norskog (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540200#comment-13540200
 ] 

Lance Norskog commented on SOLR-1972:
-

The 25/75 values come from weights, and can be changed to 99/95. I have a patch 
for that but never submitted it.

 Need additional query stats in admin interface - median, 95th and 99th 
 percentile
 -

 Key: SOLR-1972
 URL: https://issues.apache.org/jira/browse/SOLR-1972
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Affects Versions: 1.4
Reporter: Shawn Heisey
Assignee: Alan Woodward
Priority: Minor
 Fix For: 4.2, 5.0

 Attachments: elyograg-1972-3.2.patch, elyograg-1972-3.2.patch, 
 elyograg-1972-trunk.patch, elyograg-1972-trunk.patch, leak-closeable.patch, 
 leak.patch, revert-SOLR-1972.patch, SOLR-1972-branch3x-url_pattern.patch, 
 SOLR-1972-branch4x.patch, SOLR-1972-branch4x.patch, SOLR-1972_metrics.patch, 
 SOLR-1972_metrics.patch, SOLR-1972_metrics.patch, SOLR-1972_metrics.patch, 
 SOLR-1972_metrics.patch, SOLR-1972_metrics.patch, SOLR-1972_metrics.patch, 
 SOLR-1972_metrics.patch, solr1972-metricsregistry-branch4x-failure.log, 
 SOLR-1972.patch, SOLR-1972.patch, SOLR-1972.patch, SOLR-1972.patch, 
 SOLR-1972-url_pattern.patch, stacktraces.tar.gz


 I would like to see more detailed query statistics from the admin GUI.  This 
 is what you can get now:
 requests : 809
 errors : 0
 timeouts : 0
 totalTime : 70053
 avgTimePerRequest : 86.59209
 avgRequestsPerSecond : 0.8148785 
 I'd like to see more data on the time per request - median, 95th percentile, 
 99th percentile, and any other statistical function that makes sense to 
 include.  In my environment, the first bunch of queries after startup tend to 
 take several seconds each.  I find that the average value tends to be useless 
 until it has several thousand queries under its belt and the caches are 
 thoroughly warmed.  The statistical functions I have mentioned would quickly 
 eliminate the influence of those initial slow queries.
 The system will have to store individual data about each query.  I don't know 
 if this is something Solr does already.  It would be nice to have a 
 configurable count of how many of the most recent data points are kept, to 
 control the amount of memory the feature uses.  The default value could be 
 something like 1024 or 4096.
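
The request above (keep the most recent N query times and report median, 95th and 99th percentile) can be sketched with a fixed-size ring buffer. This is only an illustration of the idea, not the approach taken in any of the attached patches; the class name and the nearest-rank percentile definition are assumptions.

```java
import java.util.Arrays;

/** Hypothetical sliding window of recent query times; not from the attached patches. */
class RecentQueryTimes {
    private final long[] window;
    private int count; // number of valid samples, at most window.length
    private int next;  // slot that the next sample overwrites

    RecentQueryTimes(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        window = new long[capacity];
    }

    /** Records one query time, overwriting the oldest sample once the window is full. */
    synchronized void record(long millis) {
        window[next] = millis;
        next = (next + 1) % window.length;
        if (count < window.length) {
            count++;
        }
    }

    /** Nearest-rank percentile (0 < p <= 100) over the retained samples. */
    synchronized long percentile(double p) {
        if (count == 0) {
            throw new IllegalStateException("no samples recorded");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = Arrays.copyOf(window, count);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * count / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        RecentQueryTimes times = new RecentQueryTimes(4);
        // The slow 5000 ms start-up query falls out of the window after four more samples.
        for (long t : new long[] {5000, 10, 20, 30, 40}) {
            times.record(t);
        }
        System.out.println(times.percentile(50)); // prints 20
        System.out.println(times.percentile(99)); // prints 40
    }
}
```

Sorting a copy on each read keeps record() cheap; with a window of 1024 or 4096 samples the sort cost on the admin page is small.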




[jira] [Commented] (LUCENE-3413) CombiningFilter to recombine tokens into a single token for sorting

2012-12-27 Thread Lance Norskog (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540206#comment-13540206
 ] 

Lance Norskog commented on LUCENE-3413:
---

For sorting, would you want 'grapes_of_wrath'? This distinguishes the word 
'grapes' from words that might start with 'grapes'. (I don't know of any, but 
you see the problem :)

Also, in this use case numerical canonicalization makes sense for searching and 
sorting. 'Twenty-two' -> 22, and also 'twenty two' -> 22. Or maybe 'twenty two' 
-> 'twenty-two'.
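
To make the separator point concrete, here is a small sketch (plain Java, no Lucene classes; the class name and the second title "Grapeshot" are invented for the example). It builds sort keys with and without a separator and shows that the order of the two titles flips.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class SortKeySeparatorDemo {
    /** Lowercases the title and joins its whitespace-separated words with the separator. */
    static String sortKey(String title, String separator) {
        return String.join(separator, title.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    /** Builds a sort key for every title and returns the keys in natural String order. */
    static List<String> sortedKeys(List<String> titles, String separator) {
        List<String> keys = new ArrayList<>();
        for (String t : titles) keys.add(sortKey(t, separator));
        Collections.sort(keys);
        return keys;
    }
}
```

With no separator, "grapeshot" sorts ahead of "grapesofwrath" because 'h' precedes 'o'. With "_" the key "grapes_of_wrath" comes first, since '_' sorts before every lowercase ASCII letter. Note that '_' sorts after digits, so a different separator may be needed if keys can contain numbers.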



 CombiningFilter to recombine tokens into a single token for sorting
 ---

 Key: LUCENE-3413
 URL: https://issues.apache.org/jira/browse/LUCENE-3413
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/analysis
Affects Versions: 2.9.3
Reporter: Chris A. Mattmann
Priority: Minor
 Attachments: LUCENE-3413.Mattmann.090311.patch.txt, 
 LUCENE-3413.Mattmann.090511.patch.txt


 I whipped up this CombiningFilter for the following use case:
 I've got a bunch of titles of e.g., Books, such as:
 The Grapes of Wrath
 Tommy Tommerson saves the World
 Top of the World
 The Tales of Beedle the Bard
 Born Free
 etc.
 I want to sort these titles using a String field that includes stopword 
 analysis (e.g., to remove "The"), and synonym filtering (e.g., for grouping), 
 etc. I created an analysis chain in Solr for this that was based off of 
 *alphaOnlySort*, which looks like this:
 {code:xml}
 <fieldType name="alphaOnlySort" class="solr.TextField" sortMissingLast="true" omitNorms="true">
   <analyzer>
     <!-- KeywordTokenizer does no actual tokenizing, so the entire
          input string is preserved as a single token
       -->
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <!-- The LowerCase TokenFilter does what you expect, which can be
          useful when you want your sorting to be case insensitive
       -->
     <filter class="solr.LowerCaseFilterFactory" />
     <!-- The TrimFilter removes any leading or trailing whitespace -->
     <filter class="solr.TrimFilterFactory" />
     <!-- The PatternReplaceFilter gives you the flexibility to use
          Java Regular expression to replace any sequence of characters
          matching a pattern with an arbitrary replacement string,
          which may include back references to portions of the original
          string matched by the pattern.

          See the Java Regular Expression documentation for more
          information on pattern and replacement string syntax.

          http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/package-summary.html
       -->
     <filter class="solr.PatternReplaceFilterFactory"
             pattern="([^a-z])" replacement="" replace="all"
     />
   </analyzer>
 </fieldType>
 {code}
 The issue with alphaOnlySort is that it doesn't support stopword removal or 
 synonyms because those operate on individual tokens rather than on the 
 full string produced by the KeywordTokenizer (which does not do 
 tokenization). I needed a filter that would allow me to change alphaOnlySort 
 and its analysis chain from using KeywordTokenizer to using 
 WhitespaceTokenizer, and then a way to recombine the tokens at the end. So, 
 take "The Grapes of Wrath". I needed a way for it to get turned into:
 {noformat}
 grapes of wrath
 {noformat}
 And then to combine those tokens into a single token:
 {noformat}
 grapesofwrath
 {noformat}
 The attached CombiningFilter takes care of that. It doesn't do it super 
 efficiently I'm guessing (since I used a StringBuffer), but I'm open to 
 suggestions on how to make it better. 
 One other thing: this analyzer apparently works fine for analysis 
 (i.e., it produces the desired tokens); however, for sorting in Solr I'm 
 getting null sort tokens. Need to figure out why. 
 Here ya go!
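
For readers without the patch handy, the combining step described above amounts to roughly the following. This is a plain-Java sketch of the idea only, not the attached CombiningFilter and not a Lucene TokenFilter; the class and method names are invented, and the stopword set is passed in by the caller.

```java
import java.util.Locale;
import java.util.Set;

class CombineTokensSketch {
    /** Lowercases, splits on whitespace, drops stopwords, and joins the rest into one token. */
    static String combine(String title, Set<String> stopwords) {
        // StringBuilder is the unsynchronized counterpart of StringBuffer,
        // which is enough here because the builder never leaves this method.
        StringBuilder sb = new StringBuilder(title.length());
        for (String token : title.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty() || stopwords.contains(token)) continue;
            sb.append(token);
        }
        return sb.toString();
    }
}
```

Called with "The Grapes of Wrath" and a stopword set containing only "the", this yields the single token grapesofwrath from the description.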




[jira] [Commented] (SOLR-4016) Deduplication is broken by partial update

2012-12-27 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13540209#comment-13540209
 ] 

Yonik Seeley commented on SOLR-4016:


If the SignatureUpdateProcessorFactory is generating the unique id, it must 
come before the distributed code.
SOLR-2822 is a good starting place for related issues / discussions (I know 
there were more discussions but I can't find them now).


 Deduplication is broken by partial update
 -

 Key: SOLR-4016
 URL: https://issues.apache.org/jira/browse/SOLR-4016
 Project: Solr
  Issue Type: Bug
  Components: update
Affects Versions: 4.0
 Environment: Tomcat6 / Catalina on Ubuntu 12.04 LTS
Reporter: Joel Nothman
Assignee: Shalin Shekhar Mangar
  Labels: 4.0.1_Candidate
 Fix For: 4.1, 5.0


 The SignatureUpdateProcessorFactory used (primarily?) for deduplication does 
 not consider partial update semantics.
 The below uses the following solrconfig.xml excerpt:
 {noformat}
   <updateRequestProcessorChain name="text_hash">
     <processor class="solr.processor.SignatureUpdateProcessorFactory">
       <bool name="enabled">true</bool>
       <str name="signatureField">text_hash</str>
       <bool name="overwriteDupes">false</bool>
       <str name="fields">text</str>
       <str name="signatureClass">solr.processor.TextProfileSignature</str>
     </processor>
     <processor class="solr.LogUpdateProcessorFactory" />
     <processor class="solr.RunUpdateProcessorFactory" />
   </updateRequestProcessorChain>
 {noformat}
 Firstly, the processor treats {noformat}{"set": "value"}{noformat} as a 
 string and hashes it, instead of the value alone:
 {noformat}
 $ curl '$URL/update?commit=true' -H 'Content-type:application/json' \
     -d '{"add":{"doc":{"id": "abcde", "text": {"set": "hello world"}}}}' \
     && curl '$URL/select?q=id:abcde'
 {"responseHeader":{"status":0,"QTime":30}}
 <?xml version="1.0" encoding="UTF-8"?>
 <response>
 <lst name="responseHeader"><int name="status">0</int><int name="QTime">1</int>
 <lst name="params"><str name="q">id:abcde</str></lst></lst>
 <result name="response" numFound="1" start="0"><doc><str name="id">abcde</str>
 <str name="text">hello world</str><str name="text_hash">ad48c7ad60ac22cc</str>
 <long name="_version_">1417247434224959488</long></doc></result>
 </response>
 $
 $ curl '$URL/update?commit=true' -H 'Content-type:application/json' \
     -d '{"add":{"doc":{"id": "abcde", "text": "hello world"}}}' \
     && curl '$URL/select?q=id:abcde'
 {"responseHeader":{"status":0,"QTime":27}}
 <?xml version="1.0" encoding="UTF-8"?>
 <response>
 <lst name="responseHeader"><int name="status">0</int><int name="QTime">1</int>
 <lst name="params"><str name="q">id:abcde</str></lst></lst>
 <result name="response" numFound="1" start="0"><doc><str name="id">abcde</str>
 <str name="text">hello world</str><str name="text_hash">b169c743d220da8d</str>
 <long name="_version_">141724802221564</long></doc></result>
 </response>
 {noformat}
 Note the different text_hash value.
 Secondly, when updating a field other than those used to create the signature 
 (which I imagine is a more common use-case), the signature is recalculated 
 from no values:
 {noformat}
 $ curl '$URL/update?commit=true' -H 'Content-type:application/json' \
     -d '{"add":{"doc":{"id": "abcde", "title": {"set": "new title"}}}}' \
     && curl '$URL/select?q=id:abcde'
 {"responseHeader":{"status":0,"QTime":39}}
 <?xml version="1.0" encoding="UTF-8"?>
 <response>
 <lst name="responseHeader"><int name="status">0</int><int name="QTime">1</int>
 <lst name="params"><str name="q">id:abcde</str></lst></lst>
 <result name="response" numFound="1" start="0"><doc><str name="id">abcde</str>
 <str name="text">hello world</str><str name="text_hash"></str>
 <str name="title">new title</str>
 <long name="_version_">1417248120480202752</long></doc></result>
 </response>
 {noformat}
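
To illustrate the first problem (hashing the whole update map instead of its value), here is a rough sketch of unwrapping a "set" operation before computing a signature. It is not Solr code: the class and method names are invented, and MD5 merely stands in for whatever signature class is configured.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;

class AtomicValueSignatureSketch {
    /** Returns the inner value of a {"set": value} map, or the value itself otherwise. */
    static Object unwrapSet(Object fieldValue) {
        if (fieldValue instanceof Map) {
            Map<?, ?> m = (Map<?, ?>) fieldValue;
            if (m.size() == 1 && m.containsKey("set")) return m.get("set");
        }
        return fieldValue;
    }

    /** Hex MD5 of the unwrapped value's string form (a stand-in for the real signature). */
    static String signature(Object fieldValue) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(
                String.valueOf(unwrapSet(fieldValue)).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

With the unwrapping in place, a plain "hello world" and a {"set": "hello world"} update produce the same signature. The second problem (a partial update that touches none of the signature fields) is not addressed by this sketch; it would presumably need either the stored field values or a rule to leave the existing signature alone.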




[jira] [Commented] (SOLR-1972) Need additional query stats in admin interface - median, 95th and 99th percentile

2012-12-27 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13540211#comment-13540211
 ] 

Shawn Heisey commented on SOLR-1972:


Lance, I see two other problems with OnlineSummarizer.

1) There are only 100 samples.  That's apparently enough for 0/25/50/75/100 
quartiles, but it seems very low for 95/99.  I was even worried about the 1024 
samples in the metrics library, but apparently it actually works very well.

2) Solr already includes mahout-math 0.3 as a dependency of carrot2.  
OnlineSummarizer was introduced in 0.4, and there are some serious bugs in it 
as late as 0.6.  Upgrading mahout-math to 0.7 causes clustering (carrot2) tests 
to fail because it can't find a mahout class.  Even the newest version of 
carrot2 depends specifically on mahout-math 0.3.

I have a checkout where I am re-applying the metrics patch and have made 
RequestHandlerBase implement Closeable, but I have zero knowledge about where 
the close() method would actually have to be called.  I have been attempting to 
figure it out, but Solr is a very large and complex codebase, so I will need 
some help.  I'm almost always idling in #lucene and #solr as 'elyograg' if 
someone wants to give me some live pointers.  I can do the actual work, but I 
need some help figuring out where handlers are created and destroyed.
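
Purely as a sketch of the lifecycle being asked about (this is not how Solr manages handlers; HandlerRegistrySketch and its methods are invented names), a registry that owns the handlers could close any Closeable one when it is replaced and when the registry itself is shut down:

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

class HandlerRegistrySketch implements Closeable {
    private final Map<String, Object> handlers = new LinkedHashMap<>();

    /** Registers a handler and closes the one it replaces, if that one is Closeable. */
    synchronized void register(String path, Object handler) {
        Object old = handlers.put(path, handler);
        if (old != handler) closeIfCloseable(old);
    }

    /** Closes every Closeable handler, e.g. when the owning core shuts down. */
    @Override
    public synchronized void close() {
        for (Object h : handlers.values()) closeIfCloseable(h);
        handlers.clear();
    }

    private static void closeIfCloseable(Object o) {
        if (o instanceof Closeable) {
            try {
                ((Closeable) o).close();
            } catch (IOException e) {
                // A real implementation would likely log and continue closing the rest.
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

The point is only that close() needs two call sites: wherever a handler instance is discarded (reload or replacement) and wherever the container holding the handlers is torn down.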


 Need additional query stats in admin interface - median, 95th and 99th 
 percentile
 -

 Key: SOLR-1972
 URL: https://issues.apache.org/jira/browse/SOLR-1972
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Affects Versions: 1.4
Reporter: Shawn Heisey
Assignee: Alan Woodward
Priority: Minor
 Fix For: 4.2, 5.0

 Attachments: elyograg-1972-3.2.patch, elyograg-1972-3.2.patch, 
 elyograg-1972-trunk.patch, elyograg-1972-trunk.patch, leak-closeable.patch, 
 leak.patch, revert-SOLR-1972.patch, SOLR-1972-branch3x-url_pattern.patch, 
 SOLR-1972-branch4x.patch, SOLR-1972-branch4x.patch, SOLR-1972_metrics.patch, 
 SOLR-1972_metrics.patch, SOLR-1972_metrics.patch, SOLR-1972_metrics.patch, 
 SOLR-1972_metrics.patch, SOLR-1972_metrics.patch, SOLR-1972_metrics.patch, 
 SOLR-1972_metrics.patch, solr1972-metricsregistry-branch4x-failure.log, 
 SOLR-1972.patch, SOLR-1972.patch, SOLR-1972.patch, SOLR-1972.patch, 
 SOLR-1972-url_pattern.patch, stacktraces.tar.gz






[jira] [Assigned] (SOLR-3881) frequent OOM in LanguageIdentifierUpdateProcessor

2012-12-27 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned SOLR-3881:
-

Assignee: Mark Miller

 frequent OOM in LanguageIdentifierUpdateProcessor
 -

 Key: SOLR-3881
 URL: https://issues.apache.org/jira/browse/SOLR-3881
 Project: Solr
  Issue Type: Bug
  Components: update
Affects Versions: 4.0
 Environment: CentOS 6.x, JDK 1.6, (java -server -Xms2G -Xmx2G 
 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=)
Reporter: Rob Tulloh
Assignee: Mark Miller
 Attachments: SOLR-3881.patch


 We are seeing frequent failures from Solr causing it to OOM. Here is the 
 stack trace we observe when this happens:
 {noformat}
 Caused by: java.lang.OutOfMemoryError: Java heap space
 at java.util.Arrays.copyOf(Arrays.java:2882)
 at 
 java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
 at 
 java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
 at java.lang.StringBuffer.append(StringBuffer.java:224)
 at 
 org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.concatFields(LanguageIdentifierUpdateProcessor.java:286)
 at 
 org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.process(LanguageIdentifierUpdateProcessor.java:189)
 at 
 org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:171)
 at 
 org.apache.solr.handler.BinaryUpdateRequestHandler$2.update(BinaryUpdateRequestHandler.java:90)
 at 
 org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:140)
 at 
 org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:120)
 at 
 org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:221)
 at 
 org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:105)
 at 
 org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:186)
 at 
 org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:112)
 at 
 org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:147)
 at 
 org.apache.solr.handler.BinaryUpdateRequestHandler.parseAndLoadDocs(BinaryUpdateRequestHandler.java:100)
 at 
 org.apache.solr.handler.BinaryUpdateRequestHandler.access$000(BinaryUpdateRequestHandler.java:47)
 at 
 org.apache.solr.handler.BinaryUpdateRequestHandler$1.load(BinaryUpdateRequestHandler.java:58)
 at 
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:59)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
 at 
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
 at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
 at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
 at 
 org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
 at 
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
 at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
 at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
 at 
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
 at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
 {noformat}
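
The trace shows the heap running out while concatFields grows a StringBuffer. One way to bound that growth, sketched here under assumptions (this is not the attached SOLR-3881.patch; the class name and the maxTotalChars parameter are invented for the example), is to stop appending once a character budget is used up:

```java
import java.util.List;

class CappedConcatSketch {
    /** Concatenates field values separated by spaces, never exceeding maxTotalChars. */
    static String concatFields(List<String> values, int maxTotalChars) {
        if (maxTotalChars <= 0) return "";
        StringBuilder sb = new StringBuilder(Math.min(maxTotalChars, 1024));
        for (String v : values) {
            if (v == null) continue;
            int remaining = maxTotalChars - sb.length();
            if (remaining <= 0) break;                 // budget used up, skip the rest
            sb.append(v, 0, Math.min(v.length(), remaining));
            if (sb.length() < maxTotalChars) sb.append(' ');
        }
        return sb.toString().trim();
    }
}
```

Language detection generally does not need an entire large document, so a cap of a few tens of thousands of characters is a plausible default, though the right value would have to be checked against detection accuracy.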




[jira] [Updated] (SOLR-3881) frequent OOM in LanguageIdentifierUpdateProcessor

2012-12-27 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-3881:
--

Fix Version/s: 5.0
   4.1

 frequent OOM in LanguageIdentifierUpdateProcessor
 -

 Key: SOLR-3881
 URL: https://issues.apache.org/jira/browse/SOLR-3881
 Project: Solr
  Issue Type: Bug
  Components: update
Affects Versions: 4.0
 Environment: CentOS 6.x, JDK 1.6, (java -server -Xms2G -Xmx2G 
 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=)
Reporter: Rob Tulloh
Assignee: Mark Miller
 Fix For: 4.1, 5.0

 Attachments: SOLR-3881.patch






[jira] [Reopened] (SOLR-4134) Cannot set multiple values into multivalued field with partial updates when using the standard RequestWriter.

2012-12-27 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley reopened SOLR-4134:



Reopening - I think this broke increment.

I added a test here:
http://svn.apache.org/viewvc?rev=1426373&view=rev

The increment code ends up trying to parse "[1]" because the XML loader always 
represents extended info (like atomic updates) as a List<Object>.  We should 
try to preserve as much information as possible and only use a list when there 
are multiple values (or if list syntax is actually used).
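
A minimal sketch of the suggestion, assuming nothing about the real loader (the class and method names are invented): collapse a one-element list to its element before the value reaches the increment logic, so the operand parses as a number.

```java
import java.util.List;

class AtomicValueShapeSketch {
    /** Returns the lone element for a one-element list, otherwise the list itself. */
    static Object simplify(List<Object> values) {
        return values.size() == 1 ? values.get(0) : values;
    }

    /** Applies an "inc" operation; throws NumberFormatException for a non-numeric operand. */
    static long increment(long current, Object operand) {
        return current + Long.parseLong(String.valueOf(operand));
    }
}
```

Passing the raw one-element list to increment fails because its string form is "[1]"; passing simplify(list) works because the operand is then just "1".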

 Cannot set multiple values into multivalued field with partial updates when 
 using the standard RequestWriter.
 ---

 Key: SOLR-4134
 URL: https://issues.apache.org/jira/browse/SOLR-4134
 Project: Solr
  Issue Type: Bug
  Components: clients - java, update
Affects Versions: 4.0
Reporter: Will Butler
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 4.1

 Attachments: SOLR-4134.patch, SOLR-4134.patch


 I would like to set multiple values into a field using partial updates like 
 so:
 {code}
 List<String> values = new ArrayList<String>();
 values.add("one");
 values.add("two");
 values.add("three");
 doc.setField("field", singletonMap("set", values));
 {code}
 When using the standard XML-based RequestWriter, you end up with a single 
 value that looks like [one, two, three], because of the toString() calls on 
 lines 130 and 132 of ClientUtils. It works properly when using the 
 BinaryRequestWriter.
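
For comparison, here is a sketch of what the XML path would need to do instead of calling toString() on the collection: emit one field element per value. This is an illustration only, not the ClientUtils code or the attached patch; the class and method names are invented, and the update="set" attribute follows the XML atomic-update syntax as I understand it.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Map;

class AtomicXmlWriterSketch {
    /** Writes one <field> element per value of an atomic-update map such as {"set": [...]}. */
    static String writeField(String name, Map<String, ?> update) {
        StringBuilder xml = new StringBuilder();
        for (Map.Entry<String, ?> e : update.entrySet()) {
            Object v = e.getValue();
            // Treat a single value as a one-element collection so both cases share a loop.
            Collection<?> values = (v instanceof Collection)
                ? (Collection<?>) v
                : Collections.singletonList(v);
            for (Object value : values) {
                xml.append("<field name=\"").append(escape(name))
                   .append("\" update=\"").append(escape(e.getKey())).append("\">")
                   .append(escape(String.valueOf(value))).append("</field>");
            }
        }
        return xml.toString();
    }

    /** Minimal XML escaping for text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

For the example above this produces three field elements (one, two, three) rather than a single element whose text is "[one, two, three]".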
