date:20140703

Build: https://builds.apache.org/job/Lucene-Solr-Tests-trunk-Java7/4761/

All tests passed

Build Log:
[...truncated 59567 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/build.xml:467:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/build.xml:406:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/extra-targets.xml:87:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/extra-targets.xml:179:
 The following files are missing svn:eol-style (or binary svn:mime-type):
* ./solr/example/solr/collection1/conf/_rest_managed.json

Total time: 138 minutes 24 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Sending artifact delta relative to Lucene-Solr-Tests-trunk-Java7 #4753
Archived 1 artifacts
Archive block size is 32768
Received 0 blocks and 464 bytes
Compression is 0.0%
Took 23 ms
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-5894) Speed up high-cardinality facets with sparse counters

2014-07-03 Thread Toke Eskildsen (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Toke Eskildsen updated SOLR-5894:
-

Attachment: SOLR-5894.patch

Changed the caching of counters to be field-specific, which works a lot better
for multi-field faceting requests. Introduced facet.sparse.maxtracked to limit
the maximum count for any facet value: This limits memory consumption at the
cost of accuracy.

Patch is for Solr 4.7.1.

Speed up high-cardinality facets with sparse counters
-

Key: SOLR-5894
URL: https://issues.apache.org/jira/browse/SOLR-5894
Project: Solr
Issue Type: Improvement
Components: SearchComponents - other
Affects Versions: 4.7.1
Reporter: Toke Eskildsen
Priority: Minor
Labels: faceted-search, faceting, memory, performance
Attachments: SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch,
SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch,
SOLR-5894.patch, SOLR-5894.patch, SOLR-5894_test.zip, SOLR-5894_test.zip,
SOLR-5894_test.zip, SOLR-5894_test.zip, SOLR-5894_test.zip,
author_7M_tags_1852_logged_queries_warmed.png,
sparse_200docs_fc_cutoff_20140403-145412.png,
sparse_500docs_20140331-151918_multi.png,
sparse_500docs_20140331-151918_single.png,
sparse_5051docs_20140328-152807.png

Field based faceting in Solr has two phases: Collecting counts for tags in
facets and extracting the requested tags.
The execution time for the collecting phase is approximately linear in the
number of hits and the number of references from hits to tags. This phase is
not the focus here.
The extraction time scales with the number of unique tags in the search
result, but is also heavily influenced by the total number of unique tags in
the facet as every counter, 0 or not, is visited by the extractor (at least
for count order). For fields with millions of unique tag values this means
tens of milliseconds added to the minimum response time (see
https://sbdevel.wordpress.com/2014/03/18/sparse-facet-counting-on-a-real-index/
for a test on a corpus with 7M unique values in the facet).
The extractor needs to visit every counter due to the current counter
structure being a plain int-array of size #unique_tags. Switching to a sparse
structure, where only the tag counters > 0 are visited, makes the extraction
time linear in the number of unique tags in the result set.
Unfortunately the number of unique tags in the result set is unknown at
collect time, so it is not possible to reliably select sparse counting vs.
full counting up front. Luckily there exist solutions for sparse sets that
have the property of switching to non-sparse mode without a switch penalty,
when the sparse-threshold is exceeded (see
http://programmingpraxis.com/2012/03/09/sparse-sets/ for an example). This
JIRA aims to implement this functionality in Solr.
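
The idea above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the code from the patch and not the exact structure from the linked article: the class name SparseCounter and its methods are hypothetical, and the fraction argument only mirrors the idea of a tracker sized as a fraction of the number of unique tags. It keeps a plain counter array plus a bounded list of touched positions; extraction visits only the touched positions unless the tracker overflows, in which case it falls back to a full scan.

```python
class SparseCounter:
    """Counter array with a bounded tracker of non-zero positions (hypothetical sketch)."""

    def __init__(self, unique_tags, fraction=0.08):
        self.counts = [0] * unique_tags
        # Maximum number of distinct tags the tracker holds before overflowing.
        self.max_tracked = max(1, int(unique_tags * fraction))
        self.tracked = []
        self.overflowed = False

    def increment(self, tag):
        if self.counts[tag] == 0 and not self.overflowed:
            if len(self.tracked) < self.max_tracked:
                self.tracked.append(tag)
            else:
                # Tracker is full: keep counting, but extraction must scan everything.
                self.overflowed = True
        self.counts[tag] += 1

    def non_zero(self):
        """Return (tag, count) pairs with count > 0, sorted by tag."""
        if self.overflowed:
            return [(t, c) for t, c in enumerate(self.counts) if c > 0]
        return sorted((t, self.counts[t]) for t in self.tracked)

    def clear(self):
        """Reset for re-use: cheap when sparse, full wipe after overflow."""
        if self.overflowed:
            self.counts = [0] * len(self.counts)
        else:
            for t in self.tracked:
                self.counts[t] = 0
        self.tracked = []
        self.overflowed = False
```

In this sketch, exceeding the tracker costs nothing extra during counting; only the extraction step changes from visiting the tracked positions to scanning the whole array.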
Current status: Sparse counting is implemented for field cache faceting, both
single- and multi-value, with and without doc-values. Sort by count only. The
patch applies cleanly to Solr 4.6.1 and should integrate well with everything
as all functionality is unchanged. After patching, the following new
parameters are possible:
* facet.sparse=true enables sparse faceting.
* facet.sparse.mintags=1 the minimum number of unique tags in the given
field for sparse faceting to be active. This is used for auto-selecting
whether sparse should be used or not.
* facet.sparse.fraction=0.08 the overhead used for the sparse tracker.
Setting this too low means that only very small result sets are handled as
sparse. Setting this too high will result in a large performance penalty if
the result set overflows the sparse tracker. Values between 0.04 and 0.1 seem to
work well.
* facet.sparse.packed=true use PackedInts for counters instead of int[]. This
saves memory, but performance will differ. Whether performance will be better
or worse depends on the corpus. Experiment with it.
* facet.sparse.cutoff=0.90 if the estimated number (based on hitcount) of
unique tags in the search result exceeds this fraction of the sparse tracker,
do not perform sparse tracking. The estimate is based on the assumption that
references from documents to tags are distributed randomly.
* facet.sparse.pool.size=2 the maximum number of sparse trackers to clear and
keep in memory, ready for usage. Clearing and re-using a counter is faster
than allocating it fresh from the heap. Setting the pool size to 0 means that
a new sparse counter will be allocated each time, just as standard Solr
faceting works.
* facet.sparse.stats=true adds a special tag with timing statistics for
sparse faceting.
* facet.sparse.stats.reset=true resets the timing statistics and clears the

[JENKINS] Lucene-Solr-Tests-4.x-Java7 - Build # 2020 - Still Failing

Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java7/2020/

All tests passed

Build Log:
[...truncated 29247 lines...]
check-licenses:
 [echo] License check under: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr
 [licenses] MISSING sha1 checksum file for: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/example/lib/ext/log4j-1.2.16.jar
 [licenses] EXPECTED sha1 checksum file : 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/licenses/log4j-1.2.16.jar.sha1

[...truncated 1 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/build.xml:467:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/build.xml:70:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/build.xml:254:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/lucene/tools/custom-tasks.xml:62:
 License check failed. Check the logs.

Total time: 139 minutes 14 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Sending artifact delta relative to Lucene-Solr-Tests-4.x-Java7 #2008
Archived 1 artifacts
Archive block size is 32768
Received 0 blocks and 464 bytes
Compression is 0.0%
Took 19 ms
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Updated] (LUCENE-5795) More Like This: ensures selection of best terms is indeed O(n)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Ksikes updated LUCENE-5795:


Attachment: LUCENE-5795

 More Like This: ensures selection of best terms is indeed O(n)
 --

 Key: LUCENE-5795
 URL: https://issues.apache.org/jira/browse/LUCENE-5795
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alex Ksikes
Priority: Minor
 Attachments: LUCENE-5795, LUCENE-5795, LUCENE-5795






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (LUCENE-5801) Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader

2014-07-03 Thread Nicola Buso (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicola Buso updated LUCENE-5801:


Attachment: LUCENE-5801_2.patch

Sure Shai, this is the patch

LUCENE-5801_2.patch: reverted FacetsConfig, added FacetsConfig inner extension 
in OrdinalMappingAtomicReader

 Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 -

 Key: LUCENE-5801
 URL: https://issues.apache.org/jira/browse/LUCENE-5801
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.7
Reporter: Nicola Buso
 Attachments: LUCENE-5801.patch, LUCENE-5801_1.patch, 
 LUCENE-5801_2.patch


 From Lucene 4.6.1 the class
 org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 was removed; resurrect it because it is used when merging indexes related to 
 merged taxonomies.




[jira] [Updated] (LUCENE-5795) More Like This: ensures selection of best terms is indeed O(n)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Ksikes updated LUCENE-5795:


Attachment: (was: LUCENE-5795)

 More Like This: ensures selection of best terms is indeed O(n)
 --

 Key: LUCENE-5795
 URL: https://issues.apache.org/jira/browse/LUCENE-5795
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alex Ksikes
Priority: Minor
 Attachments: LUCENE-5795, LUCENE-5795, LUCENE-5795







[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0_05) - Build # 10710 - Still Failing!

2014-07-03 Thread ASF subversion and git services (JIRA)

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10710/
Java: 32bit/jdk1.8.0_05 -client -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 51542 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:467: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:406: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:87: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:181: 
Source checkout is dirty after running tests!!! Offending files:
* ./solr/licenses/gimap-1.5.1.jar.sha1
* ./solr/licenses/greenmail-1.3.1b.jar.sha1
* ./solr/licenses/javax.mail-1.5.1.jar.sha1

Total time: 78 minutes 15 seconds
Build step 'Invoke Ant' marked build as failure
[description-setter] Description set: Java: 32bit/jdk1.8.0_05 -client 
-XX:+UseParallelGC
Archiving artifacts
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any




[jira] [Assigned] (LUCENE-5801) Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader

2014-07-03 Thread Shai Erera (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera reassigned LUCENE-5801:
--

Assignee: Shai Erera

 Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 -

 Key: LUCENE-5801
 URL: https://issues.apache.org/jira/browse/LUCENE-5801
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.7
Reporter: Nicola Buso
Assignee: Shai Erera
 Attachments: LUCENE-5801.patch, LUCENE-5801.patch, 
 LUCENE-5801_1.patch, LUCENE-5801_2.patch


 From Lucene 4.6.1 the class
 org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 was removed; resurrect it because it is used when merging indexes related to 
 merged taxonomies.




[jira] [Updated] (LUCENE-5801) Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader

2014-07-03 Thread Shai Erera (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5801:
---

Attachment: LUCENE-5801.patch

Thanks Nicola. I updated the patch with a CHANGES entry. Also, I fixed the 
DocValues class to not call ordsReader.getReader() for every document. I will 
run tests and commit shortly.

 Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 -

 Key: LUCENE-5801
 URL: https://issues.apache.org/jira/browse/LUCENE-5801
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.7
Reporter: Nicola Buso
 Attachments: LUCENE-5801.patch, LUCENE-5801.patch, 
 LUCENE-5801_1.patch, LUCENE-5801_2.patch


 From Lucene 4.6.1 the class
 org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 was removed; resurrect it because it is used when merging indexes related to 
 merged taxonomies.




[jira] [Commented] (LUCENE-5801) Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader


[ 
https://issues.apache.org/jira/browse/LUCENE-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051310#comment-14051310
 ] 

ASF subversion and git services commented on LUCENE-5801:
-

Commit 1607582 from [~shaie] in branch 'dev/trunk'
[ https://svn.apache.org/r1607582 ]

LUCENE-5801: add back OrdinalMappingAtomicReader

 Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 -

 Key: LUCENE-5801
 URL: https://issues.apache.org/jira/browse/LUCENE-5801
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.7
Reporter: Nicola Buso
Assignee: Shai Erera
 Attachments: LUCENE-5801.patch, LUCENE-5801.patch, 
 LUCENE-5801_1.patch, LUCENE-5801_2.patch


 From Lucene 4.6.1 the class
 org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 was removed; resurrect it because it is used when merging indexes related to 
 merged taxonomies.




[jira] [Commented] (LUCENE-5801) Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader

2014-07-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051316#comment-14051316
 ] 

ASF subversion and git services commented on LUCENE-5801:
-

Commit 1607585 from [~shaie] in branch 'dev/trunk'
[ https://svn.apache.org/r1607585 ]

LUCENE-5801: move test under correct package

 Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 -

 Key: LUCENE-5801
 URL: https://issues.apache.org/jira/browse/LUCENE-5801
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.7
Reporter: Nicola Buso
Assignee: Shai Erera
 Attachments: LUCENE-5801.patch, LUCENE-5801.patch, 
 LUCENE-5801_1.patch, LUCENE-5801_2.patch


 From Lucene 4.6.1 the class
 org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 was removed; resurrect it because it is used when merging indexes related to 
 merged taxonomies.




[jira] [Commented] (LUCENE-5801) Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader

2014-07-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051317#comment-14051317
 ] 

ASF subversion and git services commented on LUCENE-5801:
-

Commit 1607586 from [~shaie] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1607586 ]

LUCENE-5801: add back OrdinalMappingAtomicReader

 Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 -

 Key: LUCENE-5801
 URL: https://issues.apache.org/jira/browse/LUCENE-5801
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.7
Reporter: Nicola Buso
Assignee: Shai Erera
 Attachments: LUCENE-5801.patch, LUCENE-5801.patch, 
 LUCENE-5801_1.patch, LUCENE-5801_2.patch


 From Lucene 4.6.1 the class
 org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 was removed; resurrect it because it is used when merging indexes related to 
 merged taxonomies.




[jira] [Resolved] (LUCENE-5801) Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader

2014-07-03 Thread Shai Erera (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-5801.


   Resolution: Fixed
Fix Version/s: 4.10
   5.0
Lucene Fields: New,Patch Available  (was: New)

I moved the classes under the .taxonomy package, as it seemed silly to have a 
dedicated package for two classes. Committed to trunk and 4x.

Thanks Nicola!

 Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 -

 Key: LUCENE-5801
 URL: https://issues.apache.org/jira/browse/LUCENE-5801
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.7
Reporter: Nicola Buso
Assignee: Shai Erera
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5801.patch, LUCENE-5801.patch, 
 LUCENE-5801_1.patch, LUCENE-5801_2.patch


 From Lucene 4.6.1 the class
 org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 was removed; resurrect it because it is used when merging indexes related to 
 merged taxonomies.




[jira] [Commented] (SOLR-5474) Add stateFormat=2 support to CloudSolrServer

2014-07-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051327#comment-14051327
 ] 

ASF subversion and git services commented on SOLR-5474:
---

Commit 1607587 from [~noble.paul] in branch 'dev/trunk'
[ https://svn.apache.org/r1607587 ]

reverting SOLR-5473 , SOLR-5474

 Add  stateFormat=2 support to CloudSolrServer
 -

 Key: SOLR-5474
 URL: https://issues.apache.org/jira/browse/SOLR-5474
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5474.patch, SOLR-5474.patch, SOLR-5474.patch, 
 fail.logs


 In this mode SolrJ would not watch any ZK node.
 It fetches the state on demand and caches the most recently used n 
 collections in memory.
 SolrJ would not listen to any ZK node. When a request comes for a collection 
 ‘xcoll’:
 it would first check if such a collection exists.
 If yes, it first looks up the details in the local cache for that collection.
 If not found in the cache, it fetches the node /collections/xcoll/state.json and 
 caches the information.
 Any query/update will be sent with an extra query param specifying the 
 collection name and version (example: _stateVer=xcoll:34). A node would throw 
 an error (INVALID_NODE) if it does not have the right version.
 If SolrJ gets an INVALID_NODE error, it would invalidate the cache and fetch 
 fresh state information for that collection (and cache it again).
 If there is a connection timeout, SolrJ assumes the node is down, re-fetches 
 the state for the collection, and tries again.
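
The lookup-and-invalidate flow described above can be sketched roughly as follows. This is an illustrative Python model, not SolrJ code: StateCache, fetch_state, send, and InvalidNodeError are hypothetical names, and both the ZooKeeper read and the request to a node are stubbed out as plain callables supplied by the caller.

```python
from collections import OrderedDict


class InvalidNodeError(Exception):
    """Stand-in for the INVALID_NODE error a server node would return."""


class StateCache:
    """LRU cache of per-collection state, refreshed on demand (hypothetical sketch)."""

    def __init__(self, fetch_state, max_collections=10):
        # fetch_state(name) models reading /collections/<name>/state.json
        # and returns a (version, state) tuple.
        self.fetch_state = fetch_state
        self.max_collections = max_collections  # assumed to be >= 1
        self.cache = OrderedDict()

    def get(self, name):
        if name in self.cache:
            self.cache.move_to_end(name)  # mark as most recently used
        else:
            self.cache[name] = self.fetch_state(name)
            if len(self.cache) > self.max_collections:
                self.cache.popitem(last=False)  # evict least recently used
        return self.cache[name]

    def invalidate(self, name):
        self.cache.pop(name, None)

    def request(self, name, send, retries=1):
        """Send a request tagged with the cached version; refresh and retry on failure."""
        for attempt in range(retries + 1):
            version, state = self.get(name)
            try:
                return send(state, f"{name}:{version}")
            except (InvalidNodeError, TimeoutError):
                # Stale state or unreachable node: drop the cached entry so the
                # next attempt re-fetches it.
                self.invalidate(name)
                if attempt == retries:
                    raise
```

The sketch treats a version mismatch and a timeout the same way, dropping the cached entry and retrying once, which is one simple reading of the description rather than a statement of how the patch behaves.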




[jira] [Commented] (SOLR-5473) Make one state.json per collection

2014-07-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051326#comment-14051326
 ] 

ASF subversion and git services commented on SOLR-5473:
---

Commit 1607587 from [~noble.paul] in branch 'dev/trunk'
[ https://svn.apache.org/r1607587 ]

reverting SOLR-5473 , SOLR-5474

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74 .patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-configname-fix.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473_undo.patch, 
 ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node




[jira] [Commented] (SOLR-5473) Make one state.json per collection

2014-07-03 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051330#comment-14051330
 ] 

Noble Paul commented on SOLR-5473:
--

I have reverted the changes.

We are working on alternative approaches.


bq. There is no consensus IMO, not until I'm convinced by more voices that I'm 
smoking crack

Consensus cannot be reached without feedback and collaboration. I hope next 
time we will do it better.


 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74 .patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-configname-fix.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473_undo.patch, 
 ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node




[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0_20-ea-b15) - Build # 10711 - Still Failing!

2014-07-03 Thread ASF subversion and git services (JIRA)

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10711/
Java: 64bit/jdk1.8.0_20-ea-b15 -XX:+UseCompressedOops -XX:+UseG1GC

1 tests failed.
FAILED:  
org.apache.lucene.facet.taxonomy.OrdinalMappingReaderTest.testTaxonomyMergeUtils

Error Message:
file _1.fdx was already written to

Stack Trace:
java.io.IOException: file _1.fdx was already written to
at 
__randomizedtesting.SeedInfo.seed([2AFF1A4359AFD7EB:C8D30CECD284E330]:0)
at 
org.apache.lucene.store.MockDirectoryWrapper.createOutput(MockDirectoryWrapper.java:492)
at 
org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:44)
at 
org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.init(CompressingStoredFieldsWriter.java:110)
at 
org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:120)
at 
org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:327)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:98)
at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:2590)
at 
org.apache.lucene.facet.taxonomy.TaxonomyMergeUtils.merge(TaxonomyMergeUtils.java:56)
at 
org.apache.lucene.facet.taxonomy.OrdinalMappingReaderTest.testTaxonomyMergeUtils(OrdinalMappingReaderTest.java:73)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)

[jira] [Commented] (LUCENE-5793) Add equals/hashCode to FieldType


[ 
https://issues.apache.org/jira/browse/LUCENE-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051366#comment-14051366
 ] 

ASF subversion and git services commented on LUCENE-5793:
-

Commit 1607595 from [~rcmuir] in branch 'dev/trunk'
[ https://svn.apache.org/r1607595 ]

LUCENE-5793: add equals/hashCode to FieldType

 Add equals/hashCode to FieldType
 

 Key: LUCENE-5793
 URL: https://issues.apache.org/jira/browse/LUCENE-5793
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shay Banon
 Attachments: LUCENE-5793.patch


 would be nice to have equals and hashCode to FieldType, so one can easily 
 check if they are the same, and for example, reuse existing default 
 implementations of it.
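
The request can be illustrated outside Lucene. What follows is a minimal Python sketch, not the committed patch: `FieldTypeSketch` is a hypothetical class, its three attributes are assumptions chosen for illustration rather than Lucene's actual property list, and `__eq__`/`__hash__` stand in for Java's `equals`/`hashCode`.

```python
class FieldTypeSketch:
    """Hypothetical stand-in for a field-type configuration object."""

    def __init__(self, indexed=False, stored=False, tokenized=True):
        # Attribute names are assumptions for illustration only.
        self.indexed = indexed
        self.stored = stored
        self.tokenized = tokenized

    def _key(self):
        # Every property that takes part in equality must also feed the hash.
        return (self.indexed, self.stored, self.tokenized)

    def __eq__(self, other):
        if not isinstance(other, FieldTypeSketch):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        # Only safe if instances are not mutated after being put in a set/dict.
        return hash(self._key())


a = FieldTypeSketch(indexed=True, stored=True)
b = FieldTypeSketch(indexed=True, stored=True)
c = FieldTypeSketch(indexed=True, stored=False)

same = (a == b)             # equal configuration, distinct objects
different = (a != c)        # differs in 'stored'
deduplicated = len({a, b, c})
```

With value-based equality, two separately built but identical configurations compare equal, which is what makes it possible to detect and reuse a default instance.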



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (LUCENE-5793) Add equals/hashCode to FieldType

2014-07-03 Thread Robert Muir (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-5793.
-

   Resolution: Fixed
Fix Version/s: 4.10
   5.0

 Add equals/hashCode to FieldType
 

 Key: LUCENE-5793
 URL: https://issues.apache.org/jira/browse/LUCENE-5793
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shay Banon
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5793.patch


 would be nice to have equals and hashCode to FieldType, so one can easily 
 check if they are the same, and for example, reuse existing default 
 implementations of it.




[jira] [Commented] (LUCENE-5793) Add equals/hashCode to FieldType

2014-07-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051370#comment-14051370
 ] 

ASF subversion and git services commented on LUCENE-5793:
-

Commit 1607598 from [~rcmuir] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1607598 ]

LUCENE-5793: add equals/hashCode to FieldType

 Add equals/hashCode to FieldType
 

 Key: LUCENE-5793
 URL: https://issues.apache.org/jira/browse/LUCENE-5793
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shay Banon
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5793.patch


 would be nice to have equals and hashCode to FieldType, so one can easily 
 check if they are the same, and for example, reuse existing default 
 implementations of it.




[jira] [Commented] (SOLR-5473) Make one state.json per collection


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051377#comment-14051377
 ] 

Mark Miller commented on SOLR-5473:
---

I'm personally frustrated as well. I feel like I've been giving feedback on 
this for a ton of months now and that it hasn't affected anything but variable 
names and minor cleanup.

I'm sorry recently was not a good time - I had vacation, work trips, and a lot 
of work and personal issues stacked on me. I only have the time I have to 
contribute. It feels like the time I put in early trying to get my concerns in 
before too much work was underway was ignored anyway though. This is exactly 
what was planned from day one that I had objections to, just slightly cleaned 
up.

I still think these APIs need to be done right. I think such minimal effort 
has gone into that so far that I don't feel very bad blocking this. All the 
work has gone into making it work (which is a big step, and I do think from 
that perspective, it's good stuff), and very little has gone into making the 
code and API changes sensible, or into expanding the tests in the areas that 
are being changed.

I also think this issue is severely mistitled, and there is not a lot of 
clarity on how you are changing cluster state caching and watchers and what 
pros and cons that change has.

Like I said, a user (and these are user APIs to explore the clusterstate too) 
or a developer will be lost trying to deal with this. I think it needs to be 
addressed in code, but even if that didn't happen, a ton more could certainly 
be addressed with comments and documentation that is not.

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74 .patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-configname-fix.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473_undo.patch, 
 ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node
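
The node layout described above can be sketched as a small path helper. This is only an illustration in Python: `state_path` is a hypothetical name, and the validation rules are assumptions, not anything taken from the attached patches.

```python
def state_path(collection_name):
    """Return the per-collection state node path described in the issue."""
    # Assumed sanity checks; the real naming rules are not given in the issue.
    if not collection_name or "/" in collection_name:
        raise ValueError("invalid collection name: %r" % (collection_name,))
    return "/collections/%s/state.json" % collection_name


example = state_path("collection1")
```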




[jira] [Updated] (SOLR-5473) Make one state.json per collection

2014-07-03 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-5473:
-

Attachment: SOLR-5473-74_POC.patch

This is not a full patch. Consider it just pseudo-code for an alternate 
approach. These are the core pieces of the changes to the APIs. I mean, at the 
API level there will be no further changes beyond this POC.

* Selective watches are still done
* ClusterState has no reference to ZkStateReader. 


I did not want to bog you down with the full patch nor do I want to put in a 
lot of effort going down the wrong path. Please let me know if you are fine 
with the approach. If yes, I shall give a full patch shortly.



 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74 .patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74_POC.patch, 
 SOLR-5473-configname-fix.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473_undo.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node




[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_60) - Build # 10592 - Still Failing!

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/10592/
Java: 32bit/jdk1.7.0_60 -client -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 52387 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:467: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:406: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:87: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:179: The 
following files are missing svn:eol-style (or binary svn:mime-type):
* 
./lucene/facet/src/java/org/apache/lucene/facet/taxonomy/OrdinalMappingAtomicReader.java
* 
./lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyMergeUtils.java
* 
./lucene/facet/src/test/org/apache/lucene/facet/taxonomy/OrdinalMappingReaderTest.java

Total time: 94 minutes 31 seconds
Build step 'Invoke Ant' marked build as failure
[description-setter] Description set: Java: 32bit/jdk1.7.0_60 -client 
-XX:+UseParallelGC
Archiving artifacts
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any




Re: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_60) - Build # 10592 - Still Failing!

2014-07-03 Thread Shalin Shekhar Mangar

I added the svn properties to these three files.
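
The message does not say which commands were run. The sketch below assumes the usual remedy for this check, `svn propset svn:eol-style native` on each reported file. The helper name `eol_style_commands` is hypothetical, the paths are the three files named in the build report, and the script only prints the commands rather than executing them.

```python
def eol_style_commands(paths, style="native"):
    """Build one 'svn propset' argument list per file (nothing is executed)."""
    return [["svn", "propset", "svn:eol-style", style, path] for path in paths]


# The three files reported as missing svn:eol-style in build #10592.
missing = [
    "lucene/facet/src/java/org/apache/lucene/facet/taxonomy/OrdinalMappingAtomicReader.java",
    "lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyMergeUtils.java",
    "lucene/facet/src/test/org/apache/lucene/facet/taxonomy/OrdinalMappingReaderTest.java",
]

commands = eol_style_commands(missing)
for command in commands:
    print(" ".join(command))
```

Each argument list could be passed to `subprocess.run` from a working copy; the property change still has to be committed afterwards.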


On Thu, Jul 3, 2014 at 6:59 PM, Policeman Jenkins Server 
jenk...@thetaphi.de wrote:

 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/10592/
 Java: 32bit/jdk1.7.0_60 -client -XX:+UseParallelGC

 All tests passed

 Build Log:
 [...truncated 52387 lines...]
 BUILD FAILED
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:467: The
 following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:406: The
 following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:87: The
 following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:179:
 The following files are missing svn:eol-style (or binary svn:mime-type):
 *
 ./lucene/facet/src/java/org/apache/lucene/facet/taxonomy/OrdinalMappingAtomicReader.java
 *
 ./lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyMergeUtils.java
 *
 ./lucene/facet/src/test/org/apache/lucene/facet/taxonomy/OrdinalMappingReaderTest.java

 Total time: 94 minutes 31 seconds
 Build step 'Invoke Ant' marked build as failure
 [description-setter] Description set: Java: 32bit/jdk1.7.0_60 -client
 -XX:+UseParallelGC
 Archiving artifacts
 Recording test results
 Email was triggered for: Failure - Any
 Sending email for trigger: Failure - Any




 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




-- 
Regards,
Shalin Shekhar Mangar.

[jira] [Updated] (SOLR-5656) Add SharedFS Failover option that allows surviving Solr instances to take over serving data for victim Solr instances.

[
https://issues.apache.org/jira/browse/SOLR-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Mark Miller updated SOLR-5656:
--

Attachment: SOLR-5656.patch

Sorry, took me a while to get this patch up.

Here is a first patch for feedback. It's a git patch against trunk from a
couple days ago.

I'll add a new patch that's converted to svn trunk shortly.

I'll also comment shortly with more details on the patch.

Add SharedFS Failover option that allows surviving Solr instances to take
over serving data for victim Solr instances.
--

Key: SOLR-5656
URL: https://issues.apache.org/jira/browse/SOLR-5656
Project: Solr
Issue Type: New Feature
Reporter: Mark Miller
Assignee: Mark Miller
Attachments: SOLR-5656.patch

When using HDFS, the Overseer should have the ability to reassign the cores
from failed nodes to running nodes.
Given that the index and transaction logs are in hdfs, it's simple for
surviving hardware to take over serving cores for failed hardware.
There are some tricky issues around having the Overseer handle this for you,
but it seems a simple first pass is not too difficult.
This will add another alternative to replicating both with hdfs and solr.
It shouldn't be specific to hdfs, and would be an option for any shared file
system Solr supports.


[jira] [Updated] (SOLR-5656) Add autoAddReplicas feature for shared file systems.


 [ 
https://issues.apache.org/jira/browse/SOLR-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-5656:
--

Summary: Add autoAddReplicas feature for shared file systems.  (was: Add 
SharedFS Failover option that allows surviving Solr instances to take over 
serving data for victim Solr instances.)

 Add autoAddReplicas feature for shared file systems.
 

 Key: SOLR-5656
 URL: https://issues.apache.org/jira/browse/SOLR-5656
 Project: Solr
  Issue Type: New Feature
Reporter: Mark Miller
Assignee: Mark Miller
 Attachments: SOLR-5656.patch


 When using HDFS, the Overseer should have the ability to reassign the cores 
 from failed nodes to running nodes.
 Given that the index and transaction logs are in hdfs, it's simple for 
 surviving hardware to take over serving cores for failed hardware.
 There are some tricky issues around having the Overseer handle this for you, 
 but it seems a simple first pass is not too difficult.
 This will add another alternative to replicating both with hdfs and solr.
 It shouldn't be specific to hdfs, and would be an option for any shared file 
 system Solr supports.




[jira] [Commented] (SOLR-2245) MailEntityProcessor Update

2014-07-03 Thread Timothy Potter (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051457#comment-14051457
]

Timothy Potter commented on SOLR-2245:
--

Thanks for digging into this guys! I have no affinity towards GreenMail, other
than it was very easy to use from JUnit. Is CDDL 1.0 suitable? If so, this
MockJavaMail project looks promising too:
https://java.net/projects/mock-javamail

MailEntityProcessor Update
--

Key: SOLR-2245
URL: https://issues.apache.org/jira/browse/SOLR-2245
Project: Solr
Issue Type: Improvement
Components: contrib - DataImportHandler
Affects Versions: 1.4, 1.4.1
Reporter: Peter Sturge
Assignee: Timothy Potter
Priority: Minor
Fix For: 4.9, 5.0

Attachments: SOLR-2245.patch, SOLR-2245.patch, SOLR-2245.patch,
SOLR-2245.patch, SOLR-2245.patch, SOLR-2245.zip

This patch addresses a number of issues in the MailEntityProcessor
contrib-extras module.
The changes are outlined here:
* Added an 'includeContent' entity attribute to allow specifying content to
be included independently of processing attachments
e.g. <entity includeContent="true" processAttachments="false" . . . />
would include message content, but not attachment content
* Added a synonym called 'processAttachments', which is synonymous to the
mis-spelled (and singular) 'processAttachement' property. This property
functions the same as processAttachement. Default= 'true' - if either is
false, then attachments are not processed. Note that only one of these should
really be specified in a given entity tag.
* Added a FLAGS.NONE value, so that if an email has no flags (i.e. it is
unread, not deleted etc.), there is still a property value stored in the
'flags' field (the value is the string none)
Note: there is a potential backward compat issue with FLAGS.NONE for clients
that expect the absence of the 'flags' field to mean 'Not read'. I'm
calculating this would be extremely rare, and is inadvisable in any case as
user flags can be arbitrarily set, so fixing it up now will ensure future
client access will be consistent.
* The folder name of an email is now included as a field called 'folder'
(e.g. folder=INBOX.Sent). This is quite handy in search/post-indexing
processing
* The addPartToDocument() method that processes attachments is significantly
re-written, as there looked to be no real way the existing code would ever
actually process attachment content and add it to the row data
Tested on the 3.x trunk with a number of popular imap servers.
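
The rule for the two attachment attributes ("Default= 'true' - if either is false, then attachments are not processed") can be sketched as follows. This is a Python illustration of the described behaviour, not the patch's Java code. The function names and the dict-based attribute lookup are assumptions.

```python
def parse_bool(value, default=True):
    """Interpret an attribute string; a missing attribute falls back to the default."""
    if value is None:
        return default
    return str(value).strip().lower() == "true"


def should_process_attachments(entity_attrs):
    """Apply the described rule: both spellings default to true, either may veto."""
    new_spelling = parse_bool(entity_attrs.get("processAttachments"))
    old_spelling = parse_bool(entity_attrs.get("processAttachement"))
    return new_spelling and old_spelling


default_case = should_process_attachments({})
new_false = should_process_attachments({"processAttachments": "false"})
old_false = should_process_attachments({"processAttachement": "false"})
conflict = should_process_attachments(
    {"processAttachments": "true", "processAttachement": "false"}
)
```

The last case shows why the description recommends specifying only one of the two attributes: when they disagree, the `false` value wins.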


[jira] [Commented] (SOLR-2245) MailEntityProcessor Update

[
https://issues.apache.org/jira/browse/SOLR-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051460#comment-14051460
]

Mark Miller commented on SOLR-2245:
---

bq. Is CDDL 1.0 suitable?

It's suitable, but in the second tier in terms of desirability. Really only
matters if license is the deciding factor for some reason though. It's suitable
if it's correctly handled in NOTICES and such.

MailEntityProcessor Update
--

Attachments: SOLR-2245.patch, SOLR-2245.patch, SOLR-2245.patch,
SOLR-2245.patch, SOLR-2245.patch, SOLR-2245.zip


[jira] [Comment Edited] (SOLR-2245) MailEntityProcessor Update

[
https://issues.apache.org/jira/browse/SOLR-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051477#comment-14051477
]

Uwe Schindler edited comment on SOLR-2245 at 7/3/14 2:08 PM:
-

bq. Is CDDL 1.0 suitable?

CDDL 1.0 is fine. The limitation here is that it must be listed correctly in
the NOTICE.txt file. We have other JAR files with this license, the most common
one is servlet-api.jar, but there are also others. See the NOTICE.txt:

{noformat}
JavaMail API 1.4.1: https://glassfish.dev.java.net/javaee5/mail/
License: Common Development and Distribution License (CDDL) v1.0
(https://glassfish.dev.java.net/public/CDDLv1.0.html)

JavaBeans Activation Framework (JAF):
http://java.sun.com/products/javabeans/jaf/index.jsp
License: Common Development and Distribution License (CDDL) v1.0
(https://glassfish.dev.java.net/public/CDDLv1.0.html)

Jersey Core: https://jersey.java.net/
License: Common Development and Distribution License (CDDL) v1.0
(https://glassfish.dev.java.net/public/CDDLv1.0.html)

Servlet-api.jar and javax.servlet-*.jar are under the CDDL license, the original
source code for this can be found at http://www.eclipse.org/jetty/downloads.php
{noformat}

was (Author: thetaphi):
bq. Is CDDL 1.0 suitable?

{noformat}
JavaMail API 1.4.1: https://glassfish.dev.java.net/javaee5/mail/
License: Common Development and Distribution License (CDDL) v1.0
(https://glassfish.dev.java.net/public/CDDLv1.0.html)

Jersey Core: https://jersey.java.net/
License: Common Development and Distribution License (CDDL) v1.0
(https://glassfish.dev.java.net/public/CDDLv1.0.html)
{noformat}

MailEntityProcessor Update
--

Attachments: SOLR-2245.patch, SOLR-2245.patch, SOLR-2245.patch,
SOLR-2245.patch, SOLR-2245.patch, SOLR-2245.zip


[jira] [Commented] (SOLR-2245) MailEntityProcessor Update

[
https://issues.apache.org/jira/browse/SOLR-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051477#comment-14051477
]

Uwe Schindler commented on SOLR-2245:
-

bq. Is CDDL 1.0 suitable?

{noformat}
JavaMail API 1.4.1: https://glassfish.dev.java.net/javaee5/mail/
License: Common Development and Distribution License (CDDL) v1.0
(https://glassfish.dev.java.net/public/CDDLv1.0.html)

Jersey Core: https://jersey.java.net/
License: Common Development and Distribution License (CDDL) v1.0
(https://glassfish.dev.java.net/public/CDDLv1.0.html)
{noformat}

MailEntityProcessor Update
--

Attachments: SOLR-2245.patch, SOLR-2245.patch, SOLR-2245.patch,
SOLR-2245.patch, SOLR-2245.patch, SOLR-2245.zip


[jira] [Commented] (SOLR-5302) Analytics Component


[ 
https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051479#comment-14051479
 ] 

Yonik Seeley commented on SOLR-5302:


bq. We're thinking of pulling this out of 5.x and going with the analytics 
framework instead, but haven't quite reached consensus on that.

I didn't realize that... can you point me at the discussion?

 Analytics Component
 ---

 Key: SOLR-5302
 URL: https://issues.apache.org/jira/browse/SOLR-5302
 Project: Solr
  Issue Type: New Feature
Reporter: Steven Bower
Assignee: Erick Erickson
 Fix For: 5.0

 Attachments: SOLR-5302.patch, SOLR-5302.patch, SOLR-5302.patch, 
 SOLR-5302.patch, Search Analytics Component.pdf, Statistical Expressions.pdf, 
 solr_analytics-2013.10.04-2.patch


 This ticket is to track a replacement for the StatsComponent. The 
 AnalyticsComponent supports the following features:
 * All functionality of StatsComponent (SOLR-4499)
 * Field Faceting (SOLR-3435)
 ** Support for limit
 ** Sorting (bucket name or any stat in the bucket)
 ** Support for offset
 * Range Faceting
 ** Supports all options of standard range faceting
 * Query Faceting (SOLR-2925)
 * Ability to use overall/field facet statistics as input to range/query 
 faceting (i.e. calc min/max date and then facet over that range)
 * Support for more complex aggregate/mapping operations (SOLR-1622)
 ** Aggregations: min, max, sum, sum-of-square, count, missing, stddev, mean, 
 median, percentiles
 ** Operations: negation, abs, add, multiply, divide, power, log, date math, 
 string reversal, string concat
 ** Easily pluggable framework to add additional operations
 * New / cleaner output format
 Outstanding Issues:
 * Multi-value field support for stats (supported for faceting)
 * Multi-shard support (may not be possible for some operations, eg median)
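
The multi-shard caveat can be made concrete with a small sketch. It is plain Python with made-up shard data, not the component's code, and it assumes a population standard deviation, which the issue does not specify. Count, sum and sum-of-squares can be added across shards and still give an exact mean and stddev, while per-shard medians cannot be combined into the true median.

```python
import math
import statistics


def partial_stats(values):
    """Per-shard partial aggregates that can simply be added together."""
    return {
        "count": len(values),
        "sum": sum(values),
        "sumsq": sum(v * v for v in values),
    }


def merge(parts):
    """Combine per-shard partials by summing each component."""
    return {key: sum(p[key] for p in parts) for key in ("count", "sum", "sumsq")}


def mean_and_stddev(stats):
    """Derive mean and population stddev from the merged partials."""
    n = stats["count"]
    mean = stats["sum"] / n
    variance = stats["sumsq"] / n - mean * mean
    return mean, math.sqrt(max(variance, 0.0))


# Hypothetical values held by two shards.
shard_a = [1, 2, 3]
shard_b = [4, 100]

merged = merge([partial_stats(shard_a), partial_stats(shard_b)])
mean, stddev = mean_and_stddev(merged)

true_median = statistics.median(shard_a + shard_b)
median_of_medians = statistics.median(
    [statistics.median(shard_a), statistics.median(shard_b)]
)
```

Here the merged mean is exact, but the median of the per-shard medians differs from the median of the full data set, so an exact distributed median needs more than fixed-size partial results from each shard.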




[jira] [Comment Edited] (SOLR-2245) MailEntityProcessor Update


[ 
https://issues.apache.org/jira/browse/SOLR-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051477#comment-14051477
 ] 

Uwe Schindler edited comment on SOLR-2245 at 7/3/14 2:12 PM:
-

bq. Is CDDL 1.0 suitable?

CDDL 1.0 is fine. The limitation here is that it must be listed correctly in 
the NOTICE.txt file. We have other JAR files with this license, the most common 
one is servlet-api.jar, but there are also others. See the NOTICE.txt:

{noformat}
JavaMail API 1.4.1: https://glassfish.dev.java.net/javaee5/mail/
License: Common Development and Distribution License (CDDL) v1.0 
(https://glassfish.dev.java.net/public/CDDLv1.0.html)

JavaBeans Activation Framework (JAF): 
http://java.sun.com/products/javabeans/jaf/index.jsp
License: Common Development and Distribution License (CDDL) v1.0 
(https://glassfish.dev.java.net/public/CDDLv1.0.html)

Jersey Core: https://jersey.java.net/
License: Common Development and Distribution License (CDDL) v1.0 
(https://glassfish.dev.java.net/public/CDDLv1.0.html)

Servlet-api.jar and javax.servlet-*.jar are under the CDDL license, the original
source code for this can be found at http://www.eclipse.org/jetty/downloads.php
{noformat}

See also: http://www.apache.org/legal/3party.html (Category B: Reciprocal 
Licenses)


was (Author: thetaphi):
bq. Is CDDL 1.0 suitable?

CDDL 1.0 is fine. The limitation here is that it must be listed correctly in 
the NOTICE.txt file. We have other JAR files with this license, the most common 
one is servlet-api.jar, but there are also others. See the NOTICE.txt:

{noformat}
JavaMail API 1.4.1: https://glassfish.dev.java.net/javaee5/mail/
License: Common Development and Distribution License (CDDL) v1.0 
(https://glassfish.dev.java.net/public/CDDLv1.0.html)

JavaBeans Activation Framework (JAF): 
http://java.sun.com/products/javabeans/jaf/index.jsp
License: Common Development and Distribution License (CDDL) v1.0 
(https://glassfish.dev.java.net/public/CDDLv1.0.html)

Jersey Core: https://jersey.java.net/
License: Common Development and Distribution License (CDDL) v1.0 
(https://glassfish.dev.java.net/public/CDDLv1.0.html)

Servlet-api.jar and javax.servlet-*.jar are under the CDDL license, the original
source code for this can be found at http://www.eclipse.org/jetty/downloads.php
{noformat}

 MailEntityProcessor Update
 --

 Key: SOLR-2245
 URL: https://issues.apache.org/jira/browse/SOLR-2245
 Project: Solr
  Issue Type: Improvement
  Components: contrib - DataImportHandler
Affects Versions: 1.4, 1.4.1
Reporter: Peter Sturge
Assignee: Timothy Potter
Priority: Minor
 Fix For: 4.9, 5.0

 Attachments: SOLR-2245.patch, SOLR-2245.patch, SOLR-2245.patch, 
 SOLR-2245.patch, SOLR-2245.patch, SOLR-2245.zip


 This patch addresses a number of issues in the MailEntityProcessor 
 contrib-extras module.
 The changes are outlined here:
 * Added an 'includeContent' entity attribute to allow specifying content to 
 be included independently of processing attachments
  e.g. <entity includeContent="true" processAttachments="false" . . . /> 
 would include message content, but not attachment content
 * Added a synonym called 'processAttachments', which is synonymous to the 
 mis-spelled (and singular) 'processAttachement' property. This property 
 functions the same as processAttachement. Default= 'true' - if either is 
 false, then attachments are not processed. Note that only one of these should 
 really be specified in a given entity tag.
 * Added a FLAGS.NONE value, so that if an email has no flags (i.e. it is 
 unread, not deleted etc.), there is still a property value stored in the 
 'flags' field (the value is the string none)
 Note: there is a potential backward compat issue with FLAGS.NONE for clients 
 that expect the absence of the 'flags' field to mean 'Not read'. I'm 
 calculating this would be extremely rare, and is inadvisable in any case as 
 user flags can be arbitrarily set, so fixing it up now will ensure future 
 client access will be consistent.
 * The folder name of an email is now included as a field called 'folder' 
 (e.g. folder=INBOX.Sent). This is quite handy in search/post-indexing 
 processing
 * The addPartToDocument() method that processes attachments is significantly 
 re-written, as there looked to be no real way the existing code would ever 
 actually process attachment content and add it to the row data
 Tested on the 3.x trunk with a number of popular imap servers.




[JENKINS] Lucene-Solr-4.x-Windows (64bit/jdk1.8.0_05) - Build # 4069 - Failure!

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Windows/4069/
Java: 64bit/jdk1.8.0_05 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 44872 lines...]
-documentation-lint:
 [echo] checking for broken html...
[jtidy] Checking for broken html (such as invalid tags)...
   [delete] Deleting directory 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\lucene\build\jtidy_tmp
 [echo] Checking for broken links...
 [exec] 
 [exec] Crawl/parse...
 [exec] 
 [exec] Verify...
 [echo] Checking for missing docs...
 [exec] Traceback (most recent call last):
 [exec]   File "C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\dev-tools/scripts/checkJavaDocs.py", line 371, in <module>
 [exec]     if checkPackageSummaries(sys.argv[1], level):
 [exec]   File "C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\dev-tools/scripts/checkJavaDocs.py", line 351, in checkPackageSummaries
 [exec]     if checkClassSummaries(fullPath):
 [exec]   File "C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\dev-tools/scripts/checkJavaDocs.py", line 215, in checkClassSummaries
 [exec]     missing.append((lastCaption, unEscapeURL(lastItem)))
 [exec]   File "C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\dev-tools/scripts/checkJavaDocs.py", line 303, in unEscapeURL
 [exec]     s = s.replace('%20', ' ')
 [exec] AttributeError: 'NoneType' object has no attribute 'replace'

BUILD FAILED
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\build.xml:467: The 
following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\build.xml:63: The 
following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\lucene\build.xml:212: 
The following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\lucene\build.xml:247: 
The following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\lucene\common-build.xml:2338:
 exec returned: 1

Total time: 128 minutes 15 seconds
Build step 'Invoke Ant' marked build as failure
[description-setter] Description set: Java: 64bit/jdk1.8.0_05 
-XX:-UseCompressedOops -XX:+UseConcMarkSweepGC
Archiving artifacts
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
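
The traceback in this report shows `unEscapeURL` calling `s.replace` on a value that is `None`. A minimal sketch of a None-tolerant helper follows. The name `unescape_url` is hypothetical and this is not the project's actual fix; whether the right remedy is a guard like this or correcting the caller that passes `None` as `lastItem` cannot be determined from the log alone.

```python
def unescape_url(s):
    """Replace '%20' with a space; pass None through instead of raising."""
    if s is None:
        # The failing call received None here; returning it unchanged avoids
        # the AttributeError but leaves the caller to handle the missing item.
        return None
    return s.replace('%20', ' ')


with_space = unescape_url('package%20summary.html')
missing_item = unescape_url(None)
```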




[jira] [Commented] (SOLR-6199) SolrJ, using SolrInputDocument methods, requires entire document to be loaded into memory

2014-07-03 Thread Joseph Gresock (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051509#comment-14051509
 ] 

Joseph Gresock commented on SOLR-6199:
--

We would also enjoy this feature, per this discussion: 
http://lucene.472066.n3.nabble.com/Streaming-large-updates-with-SolrJ-td4144527.html

 SolrJ, using SolrInputDocument methods, requires entire document to be loaded 
 into memory
 -

 Key: SOLR-6199
 URL: https://issues.apache.org/jira/browse/SOLR-6199
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 4.7.3
Reporter: Karl Wright

 ManifoldCF has historically used Solr's extracting update handler for 
 transmitting binary documents to Solr.  Recently, we've included Tika 
 processing of binary documents, and wanted instead to send an (unlimited by 
 ManifoldCF) character stream as a primary content field to Solr instead.  
 Unfortunately, it appears that the SolrInputDocument metaphor for receiving 
 extracted content and metadata requires that all fields be completely 
 converted to String objects.  This will cause ManifoldCF to certainly run out 
 of memory at some point, when multiple ManifoldCF threads all try to convert 
 large documents to in-memory strings at the same time.
 I looked into what would be needed to add streaming support to UpdateRequest 
 and SolrInputDocument.  Basically, a legal option would be to set a field 
 value that would be a Reader or a Reader[].  It would be straightforward to 
 implement this, EXCEPT for the fact that SolrCloud apparently makes 
 UpdateRequest copies, and copying a Reader isn't going to work unless there's 
 a backing solid object somewhere.  Even then, I could have gotten this to 
 work by using a temporary file for large streams, but there's no signal from 
 SolrCloud when it is done with its copies of UpdateRequest, so there's no 
 place to free any backing storage.
 If anyone knows a good way to do non-extracting updates without loading 
 entire documents into memory, please let me know.
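
The temporary-file idea mentioned above can be sketched in plain Java. This is only an illustration under stated assumptions: `SpooledContent` and its methods are hypothetical names, not part of SolrJ, and the explicit `close()` stands in for the "done" signal that the description says SolrCloud does not currently provide.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not SolrJ API): spools a character stream to a temp file. */
final class SpooledContent implements AutoCloseable {
  private final Path file;

  private SpooledContent(Path file) {
    this.file = file;
  }

  /** Copies the reader to disk in fixed-size chunks, so the full text is never held in memory. */
  static SpooledContent spool(Reader in) throws IOException {
    Path tmp = Files.createTempFile("spooled-content", ".txt");
    try (Writer out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
      char[] buf = new char[8192];
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
    } catch (IOException e) {
      Files.deleteIfExists(tmp); // do not leak the temp file on a failed copy
      throw e;
    }
    return new SpooledContent(tmp);
  }

  /** Each call returns an independent reader, which is what a copied request would need. */
  Reader open() throws IOException {
    return Files.newBufferedReader(file, StandardCharsets.UTF_8);
  }

  /** Frees the backing storage; the caller has to know when every copy is finished. */
  @Override
  public void close() throws IOException {
    Files.deleteIfExists(file);
  }
}
```

The open question from the description remains: something has to call `close()` once all request copies are done, and that is exactly the signal that is missing.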



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0_20-ea-b15) - Build # 10712 - Still Failing!

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10712/
Java: 32bit/jdk1.8.0_20-ea-b15 -server -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 51495 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:467: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:406: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:87: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:181: 
Source checkout is dirty after running tests!!! Offending files:
* ./solr/licenses/gimap-1.5.1.jar.sha1
* ./solr/licenses/greenmail-1.3.1b.jar.sha1
* ./solr/licenses/javax.mail-1.5.1.jar.sha1

Total time: 69 minutes 46 seconds
Build step 'Invoke Ant' marked build as failure
[description-setter] Description set: Java: 32bit/jdk1.8.0_20-ea-b15 -server 
-XX:+UseParallelGC
Archiving artifacts
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0_20-ea-b15) - Build # 10712 - Still Failing!

2014-07-03 Thread Shalin Shekhar Mangar

I have removed those sha1 files.


On Thu, Jul 3, 2014 at 8:10 PM, Policeman Jenkins Server 
jenk...@thetaphi.de wrote:

 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10712/
 Java: 32bit/jdk1.8.0_20-ea-b15 -server -XX:+UseParallelGC

 All tests passed

 Build Log:
 [...truncated 51495 lines...]
 BUILD FAILED
 /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:467: The
 following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:406: The
 following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:87:
 The following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:181:
 Source checkout is dirty after running tests!!! Offending files:
 * ./solr/licenses/gimap-1.5.1.jar.sha1
 * ./solr/licenses/greenmail-1.3.1b.jar.sha1
 * ./solr/licenses/javax.mail-1.5.1.jar.sha1

 Total time: 69 minutes 46 seconds
 Build step 'Invoke Ant' marked build as failure
 [description-setter] Description set: Java: 32bit/jdk1.8.0_20-ea-b15
 -server -XX:+UseParallelGC
 Archiving artifacts
 Recording test results
 Email was triggered for: Failure - Any
 Sending email for trigger: Failure - Any




 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




-- 
Regards,
Shalin Shekhar Mangar.

[jira] [Commented] (SOLR-5302) Analytics Component

2014-07-03 Thread Steven Bower (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051528#comment-14051528
 ] 

Steven Bower commented on SOLR-5302:


Making the types of expressions the analytics framework supports distributed is 
hard, period, regardless of the framework (e.g. median, percentiles, etc.), 
unless you accept some error rate. Can someone point me to the analytics 
framework that is being talked about?
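
A small, self-contained illustration of why median is harder to distribute than mean. The class and method names are made up for this sketch and have nothing to do with the AnalyticsComponent code; the point is only that a mean merges exactly from per-shard (sum, count) pairs, while combining per-shard medians generally gives a different answer than the median over all values.

```java
import java.util.Arrays;

/** Illustrative only: compares mergeable and non-mergeable aggregates across shards. */
final class ShardMedianDemo {

  /** Median of a single array (average of the two middle values for even sizes). */
  static double median(double[] values) {
    double[] sorted = values.clone();
    Arrays.sort(sorted);
    int mid = sorted.length / 2;
    return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
  }

  /** Exact: each shard only needs to report its sum and its count. */
  static double mergedMean(double[][] shards) {
    double sum = 0;
    long count = 0;
    for (double[] shard : shards) {
      double shardSum = 0;
      for (double v : shard) {
        shardSum += v;
      }
      sum += shardSum;
      count += shard.length;
    }
    return sum / count;
  }

  /** Naive merge: median of the per-shard medians. Generally not the true median. */
  static double medianOfShardMedians(double[][] shards) {
    double[] medians = new double[shards.length];
    for (int i = 0; i < shards.length; i++) {
      medians[i] = median(shards[i]);
    }
    return median(medians);
  }

  /** Exact, but requires shipping every value to one place. */
  static double globalMedian(double[][] shards) {
    int total = 0;
    for (double[] shard : shards) {
      total += shard.length;
    }
    double[] all = new double[total];
    int pos = 0;
    for (double[] shard : shards) {
      System.arraycopy(shard, 0, all, pos, shard.length);
      pos += shard.length;
    }
    return median(all);
  }
}
```

For shards {1, 2, 3} and {10, 20, 30, 40, 50}, the merged mean is exactly 19.5, the median of shard medians is 16.0, and the true median is 15.0.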

 Analytics Component
 ---

 Key: SOLR-5302
 URL: https://issues.apache.org/jira/browse/SOLR-5302
 Project: Solr
  Issue Type: New Feature
Reporter: Steven Bower
Assignee: Erick Erickson
 Fix For: 5.0

 Attachments: SOLR-5302.patch, SOLR-5302.patch, SOLR-5302.patch, 
 SOLR-5302.patch, Search Analytics Component.pdf, Statistical Expressions.pdf, 
 solr_analytics-2013.10.04-2.patch


 This ticket is to track a replacement for the StatsComponent. The 
 AnalyticsComponent supports the following features:
 * All functionality of StatsComponent (SOLR-4499)
 * Field Faceting (SOLR-3435)
 ** Support for limit
 ** Sorting (bucket name or any stat in the bucket)
 ** Support for offset
 * Range Faceting
 ** Supports all options of standard range faceting
 * Query Faceting (SOLR-2925)
 * Ability to use overall/field facet statistics as input to range/query 
 faceting (ie calc min/max date and then facet over that range)
 * Support for more complex aggregate/mapping operations (SOLR-1622)
 ** Aggregations: min, max, sum, sum-of-square, count, missing, stddev, mean, 
 median, percentiles
 ** Operations: negation, abs, add, multiply, divide, power, log, date math, 
 string reversal, string concat
 ** Easily pluggable framework to add additional operations
 * New / cleaner output format
 Outstanding Issues:
 * Multi-value field support for stats (supported for faceting)
 * Multi-shard support (may not be possible for some operations, eg median)



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5473) Make one state.json per collection

2014-07-03 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051538#comment-14051538
 ] 

Shalin Shekhar Mangar commented on SOLR-5473:
-

I think that's better but I don't like:
{code}
/** This is not a public API. Only used by ZkController. */
public void removeZKWatch(final String coll) {
  synchronized (this) {
    watchedCollections.remove(coll);
    clusterState = clusterState.copyWith(
        Collections.<String, DocCollection>singletonMap(coll, null));
  }
}
{code}

If it's not supposed to be public API then it shouldn't be public. I think we 
should change this to an event listener model so that watching/un-watching can 
be done automatically. [~markrmil...@gmail.com] - can you please take a look as 
well?
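
A minimal sketch of what an event listener model could look like, using only the JDK. Every name here (`CollectionLifecycleListener`, `WatchedCollections`, `CollectionLifecycle`) is hypothetical and chosen for illustration; none of it is the ZkStateReader or ZkController API, and the real wiring would depend on where collection open/close events originate.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical callback interface: fired when a collection starts or stops being used locally. */
interface CollectionLifecycleListener {
  void onCollectionOpened(String collection);

  void onCollectionClosed(String collection);
}

/** Keeps the watched set in sync with lifecycle events; no public add/remove methods needed. */
final class WatchedCollections implements CollectionLifecycleListener {
  private final Set<String> watched = ConcurrentHashMap.newKeySet();

  @Override
  public void onCollectionOpened(String collection) {
    watched.add(collection);
  }

  @Override
  public void onCollectionClosed(String collection) {
    watched.remove(collection);
  }

  boolean isWatched(String collection) {
    return watched.contains(collection);
  }
}

/** Hypothetical event source standing in for whatever component opens and closes cores. */
final class CollectionLifecycle {
  private final List<CollectionLifecycleListener> listeners = new CopyOnWriteArrayList<>();

  void addListener(CollectionLifecycleListener listener) {
    listeners.add(listener);
  }

  void open(String collection) {
    for (CollectionLifecycleListener l : listeners) {
      l.onCollectionOpened(collection);
    }
  }

  void close(String collection) {
    for (CollectionLifecycleListener l : listeners) {
      l.onCollectionClosed(collection);
    }
  }
}
```

With this shape, the code that un-watches a collection is reached only through the listener callback, so nothing like `removeZKWatch` has to be exposed as a public method.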

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74 .patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74_POC.patch, 
 SOLR-5473-configname-fix.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473_undo.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0_20-ea-b15) - Build # 10711 - Still Failing!

It reproduces ... something's fishy with IW.changeCount -- seems that after
you open an IW on an existing Directory, changeCount = 0. I will try to
reproduce this in a standalone test to verify if it's a general IW bug or
not.

Shai


On Thu, Jul 3, 2014 at 2:54 PM, Policeman Jenkins Server 
jenk...@thetaphi.de wrote:

 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10711/
 Java: 64bit/jdk1.8.0_20-ea-b15 -XX:+UseCompressedOops -XX:+UseG1GC

 1 tests failed.
 FAILED:
  
 org.apache.lucene.facet.taxonomy.OrdinalMappingReaderTest.testTaxonomyMergeUtils

 Error Message:
 file _1.fdx was already written to

 Stack Trace:
 java.io.IOException: file _1.fdx was already written to
 at
 __randomizedtesting.SeedInfo.seed([2AFF1A4359AFD7EB:C8D30CECD284E330]:0)
 at
 org.apache.lucene.store.MockDirectoryWrapper.createOutput(MockDirectoryWrapper.java:492)
 at
 org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:44)
 at
 org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.init(CompressingStoredFieldsWriter.java:110)
 at
 org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:120)
 at
 org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:327)
 at
 org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:98)
 at
 org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:2590)
 at
 org.apache.lucene.facet.taxonomy.TaxonomyMergeUtils.merge(TaxonomyMergeUtils.java:56)
 at
 org.apache.lucene.facet.taxonomy.OrdinalMappingReaderTest.testTaxonomyMergeUtils(OrdinalMappingReaderTest.java:73)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:483)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
 at
 org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
 at
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at
 org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
 at
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
 at
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
 at
 com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
 at
 com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
 at
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at
 org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
 at
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at

[JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.7.0_60) - Build # 10593 - Still Failing!

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/10593/
Java: 64bit/jdk1.7.0_60 -XX:+UseCompressedOops -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 45269 lines...]
-documentation-lint:
 [echo] checking for broken html...
[jtidy] Checking for broken html (such as invalid tags)...
   [delete] Deleting directory 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/jtidy_tmp
 [echo] Checking for broken links...
 [exec] 
 [exec] Crawl/parse...
 [exec] 
 [exec] Verify...
 [echo] Checking for missing docs...
 [exec] 
 [exec] 
build/docs/facet/org/apache/lucene/facet/taxonomy/TaxonomyMergeUtils.html
 [exec]   missing Constructors: TaxonomyMergeUtils()
 [exec] 
 [exec] Missing javadocs were found!

BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:467: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:63: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build.xml:212: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build.xml:247: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/common-build.xml:2338: 
exec returned: 1

Total time: 78 minutes 59 seconds
Build step 'Invoke Ant' marked build as failure
[description-setter] Description set: Java: 64bit/jdk1.7.0_60 
-XX:+UseCompressedOops -XX:+UseSerialGC
Archiving artifacts
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: [JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.7.0_60) - Build # 10593 - Still Failing!

I'll handle - will make it an abstract class.

Shai
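
For reference, one common way to keep a static-only utility class from exposing an undocumented public constructor is sketched below. `MergeUtilsSketch` is a hypothetical stand-in, not the actual TaxonomyMergeUtils source, and whether the eventual fix looks like this is an assumption.

```java
/** Hypothetical stand-in for a static-only utility class. */
abstract class MergeUtilsSketch {

  /** Private, so no public constructor exists for the javadoc checker to flag. */
  private MergeUtilsSketch() {}

  /** Example static helper: joins two names with an underscore. */
  static String mergedName(String a, String b) {
    return a + "_" + b;
  }
}
```

Note that `abstract` alone only prevents instantiation; a class with no declared constructor still gets an implicit one with the same access as the class, so the explicit private constructor is what removes it from the public documentation.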


On Thu, Jul 3, 2014 at 7:01 PM, Policeman Jenkins Server 
jenk...@thetaphi.de wrote:

 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/10593/
 Java: 64bit/jdk1.7.0_60 -XX:+UseCompressedOops -XX:+UseSerialGC

 All tests passed

 Build Log:
 [...truncated 45269 lines...]
 -documentation-lint:
  [echo] checking for broken html...
 [jtidy] Checking for broken html (such as invalid tags)...
[delete] Deleting directory
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/jtidy_tmp
  [echo] Checking for broken links...
  [exec]
  [exec] Crawl/parse...
  [exec]
  [exec] Verify...
  [echo] Checking for missing docs...
  [exec]
  [exec]
 build/docs/facet/org/apache/lucene/facet/taxonomy/TaxonomyMergeUtils.html
  [exec]   missing Constructors: TaxonomyMergeUtils()
  [exec]
  [exec] Missing javadocs were found!

 BUILD FAILED
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:467: The
 following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:63: The
 following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build.xml:212: The
 following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build.xml:247: The
 following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/common-build.xml:2338:
 exec returned: 1

 Total time: 78 minutes 59 seconds
 Build step 'Invoke Ant' marked build as failure
 [description-setter] Description set: Java: 64bit/jdk1.7.0_60
 -XX:+UseCompressedOops -XX:+UseSerialGC
 Archiving artifacts
 Recording test results
 Email was triggered for: Failure - Any
 Sending email for trigger: Failure - Any




 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org

Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0_20-ea-b15) - Build # 10711 - Still Failing!

I am still digging - this isn't related to IW.changeCount, but
SegInfos.counter.

I think there's a bug in RandomIndexWriter - when you call close(), it
randomly executes a forceMerge(), then closes, without committing.
In IW.close(), we only check for lost changes prior to 5.0, therefore we
don't hit the RuntimeException, and the changes made by forceMerge() are
rolled-back.
But at that point, _1.fdx was already written (even though it was deleted
by rollback()), therefore addIndexes cannot write it again.

This seems to be a bug in RIW, not in core code, and if I add w.commit()
after that randomForceMerge, the test passes.

I will review it again later and commit, perhaps someone can take a second
look.

Shai
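
The failure mode described above can be reproduced with a toy model. This is not Lucene code: `WriteOnceDirectory` loosely imitates a directory that refuses to reuse a file name even after deletion, `ToyWriter` imitates a writer whose segment counter is only persisted on commit, and all names are invented for the sketch.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy directory: a file name can be created once, even if the file is later deleted. */
final class WriteOnceDirectory {
  private final Set<String> everCreated = new HashSet<>();
  int committedCounter = 0; // plays the role of the persisted segment counter

  void create(String name) {
    if (!everCreated.add(name)) {
      throw new IllegalStateException("file " + name + " was already written to");
    }
  }
}

/** Toy writer: names segments from a counter and drops uncommitted counter changes on close. */
final class ToyWriter {
  private final WriteOnceDirectory dir;
  private int counter;

  ToyWriter(WriteOnceDirectory dir) {
    this.dir = dir;
    this.counter = dir.committedCounter; // a reopened writer starts from the last commit
  }

  void writeSegment() {
    dir.create("_" + counter + ".fdx");
    counter++;
  }

  void commit() {
    dir.committedCounter = counter;
  }

  void close() {
    // No implicit commit: anything written since the last commit is rolled back.
  }
}

final class RollbackDemo {
  /** Returns true if a second writer can write a segment after the first one is closed. */
  static boolean reopenAndWrite(boolean commitBeforeClose) {
    WriteOnceDirectory dir = new WriteOnceDirectory();
    ToyWriter w = new ToyWriter(dir);
    w.writeSegment(); // _0.fdx
    w.commit();
    w.writeSegment(); // _1.fdx, e.g. the output of a random forceMerge
    if (commitBeforeClose) {
      w.commit();
    }
    w.close();
    try {
      new ToyWriter(dir).writeSegment(); // reuses _1.fdx unless the counter was committed
      return true;
    } catch (IllegalStateException e) {
      return false;
    }
  }
}
```

In this model `reopenAndWrite(false)` fails on the reused name and `reopenAndWrite(true)` succeeds, which mirrors the effect of adding a commit after the random forceMerge.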


On Thu, Jul 3, 2014 at 6:50 PM, Shai Erera sh...@apache.org wrote:

 It reproduces ... something's fishy with IW.changeCount -- seems that
 after you open an IW on an existing Directory, changeCount = 0. I will try
 to reproduce this in a standalone test to verify if it's a general IW bug
 or not.

 Shai


 On Thu, Jul 3, 2014 at 2:54 PM, Policeman Jenkins Server 
 jenk...@thetaphi.de wrote:

 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10711/
 Java: 64bit/jdk1.8.0_20-ea-b15 -XX:+UseCompressedOops -XX:+UseG1GC

 1 tests failed.
 FAILED:
  
 org.apache.lucene.facet.taxonomy.OrdinalMappingReaderTest.testTaxonomyMergeUtils

 Error Message:
 file _1.fdx was already written to


Single Field instance for both DocValues and indexed?

2014-07-03 Thread david.w.smi...@gmail.com

I was experimenting with having a user-provided/customized FieldType
for indexing code of (mostly) a set of numeric fields that are of a
common type.  The user/developer might want the type to both be
indexed & have docValues, or perhaps just one.  Or maybe stored
hypothetically for the purposes of this discussion.   Even though
Lucene’s FieldType allows you to configure both DocValues &
indexed=true, it appears impossible to provide a single Field instance
with both options; the constructors force an either-or situation.  Of
course I know I could add more fields depending on the options (for
example as seen in Solr’s FieldType); but I think it’s awkward.  It
*seems* that Lucene’s indexing guts (DefaultIndexingChain) are
agnostic of this.  Wouldn’t it be great if you could simply provide a
Field with a value and FieldType (with various options) and it’d just
work?  Higher up the stack (Solr and presumably ElasticSearch), there
are abstractions that basically make this possible, but why not at the
Lucene layer?

~ David

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

NOTICE: increased CWIKI permissions for committers on Solr Ref Guide

2014-07-03 Thread Chris Hostetter



FYI: Following some changes to the way CWIKI permissions work a few months 
back, and per consensus in a recent thread on the PMC list, I've revised 
the Space Permissions for the Solr Ref Guide so that now all committers 
(who have bothered to create CWIKI accounts) have full Space Admin 
permissions for the Solr Ref Guide space.  (These are the permissions 
needed to add users, edit templates, change CSS, etc.)


Anyone with existing edit perms on the ref guide should now see a new 
Space Admin option under Browse Space -- please use it sparingly and 
try to avoid deleting the whole space by accident.



I'll be revising the ACLs in a few minutes to reflect the new policy 
moving forward...


https://cwiki.apache.org/confluence/display/solr/Internal+-+CWIKI+ACLs


-Hoss
http://www.lucidworks.com/

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0_20-ea-b15) - Build # 10711 - Still Failing!

2014-07-03 Thread Michael McCandless

+1 to add the commit in RIW if it does an operation like forceMerge in close.

Mike McCandless

http://blog.mikemccandless.com


On Thu, Jul 3, 2014 at 12:23 PM, Shai Erera sh...@apache.org wrote:
 I am still digging - this isn't related to IW.changeCount, but
 SegInfos.counter.

 I think there's a bug in RandomIndexWriter - when you call close(), it
 randomly executes a forceMerge(), then closes, without committing.
 In IW.close(), we only check for lost changes prior to 5.0, therefore we
 don't hit the RuntimeException, and the changes made by forceMerge() are
 rolled-back.
 But at that point, _1.fdx was already written (even though it was deleted by
 rollback()), therefore addIndexes cannot write it again.

 This seems to be a bug in RIW, not in core code, and if I add w.commit()
 after that randomForceMerge, the test passes.

 I will review it again later and commit, perhaps someone can take a second
 look.

 Shai


 On Thu, Jul 3, 2014 at 6:50 PM, Shai Erera sh...@apache.org wrote:

 It reproduces ... something's fishy with IW.changeCount -- seems that
 after you open an IW on an existing Directory, changeCount = 0. I will try
 to reproduce this in a standalone test to verify if it's a general IW bug or
 not.

 Shai


 On Thu, Jul 3, 2014 at 2:54 PM, Policeman Jenkins Server
 jenk...@thetaphi.de wrote:

 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10711/
 Java: 64bit/jdk1.8.0_20-ea-b15 -XX:+UseCompressedOops -XX:+UseG1GC

 1 tests failed.
 FAILED:
 org.apache.lucene.facet.taxonomy.OrdinalMappingReaderTest.testTaxonomyMergeUtils

 Error Message:
 file _1.fdx was already written to


[jira] [Updated] (SOLR-5656) Add autoAddReplicas feature for shared file systems.


 [ 
https://issues.apache.org/jira/browse/SOLR-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-5656:
--

Attachment: SOLR-5656.patch

Here is an svn patch against trunk.

 Add autoAddReplicas feature for shared file systems.
 

 Key: SOLR-5656
 URL: https://issues.apache.org/jira/browse/SOLR-5656
 Project: Solr
  Issue Type: New Feature
Reporter: Mark Miller
Assignee: Mark Miller
 Attachments: SOLR-5656.patch, SOLR-5656.patch


 When using HDFS, the Overseer should have the ability to reassign the cores 
 from failed nodes to running nodes.
 Given that the index and transaction logs are in hdfs, it's simple for 
 surviving hardware to take over serving cores for failed hardware.
 There are some tricky issues around having the Overseer handle this for you, 
 but seems a simple first pass is not too difficult.
 This will add another alternative to replicating both with hdfs and solr.
 It shouldn't be specific to hdfs, and would be an option for any shared file 
 system Solr supports.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-2245) MailEntityProcessor Update

2014-07-03 Thread Hoss Man (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051696#comment-14051696
]

Hoss Man commented on SOLR-2245:

FYI: Even though it looks like we are going to avoid greenmail, I filed
LEGAL-206 to track the impacts on other ASF projects.

MailEntityProcessor Update
--

Attachments: SOLR-2245.patch, SOLR-2245.patch, SOLR-2245.patch,
SOLR-2245.patch, SOLR-2245.patch, SOLR-2245.zip

--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (LUCENE-5795) More Like This: ensures selection of best terms is indeed O(n)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Ksikes updated LUCENE-5795:


Attachment: LUCENE-5795

 More Like This: ensures selection of best terms is indeed O(n)
 --

 Key: LUCENE-5795
 URL: https://issues.apache.org/jira/browse/LUCENE-5795
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Alex Ksikes
Priority: Minor
 Attachments: LUCENE-5795, LUCENE-5795, LUCENE-5795, LUCENE-5795







[jira] [Commented] (LUCENE-5795) More Like This: ensures selection of best terms is indeed O(n)


[ 
https://issues.apache.org/jira/browse/LUCENE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051735#comment-14051735
 ] 

Alex Ksikes commented on LUCENE-5795:
-

Added test to check for top N.


[jira] [Commented] (SOLR-5302) Analytics Component

2014-07-03 Thread Erick Erickson (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051764#comment-14051764
]

Erick Erickson commented on SOLR-5302:
--

bq: I didn't realize that... can you point me at the discussion?

I mis-stated that severely, my apologies. What I should have said is more along
the lines that I don't quite know what to do with back-porting the analytics
stuff to 4.x. Or whether we should. It's quite a bit of code, the interface is
complex, and it doesn't play nice in distributed mode. I believe there are
functions that simply won't work distributed. And maybe can't.

Then there's the pluggable analytics framework that's been recently added. I
really wonder whether the right thing to do long-term is to pull this out of 5x
and port as much as possible into the pluggable analytics framework piecemeal
as necessary, stealing as much as possible and supporting what can be supported
in distributed mode. That still leaves the question of what to do with
functions that are inherently difficult/impossible to support in sharded
environments...
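
As an illustration of why some functions resist distribution, the following sketch compares an aggregate that merges cleanly across shards (sum) with one that does not (median). The shard contents and all class and method names are made up for the example; nothing here is taken from the analytics component itself.

```java
import java.util.Arrays;

// Hypothetical illustration; not code from the analytics component.
final class ShardMedianSketch {

  /** Median of a non-empty array (mean of the two middle values for even lengths). */
  static double median(double[] values) {
    if (values.length == 0) {
      throw new IllegalArgumentException("median of an empty array is undefined");
    }
    double[] sorted = values.clone();
    Arrays.sort(sorted);
    int n = sorted.length;
    return (n % 2 == 1) ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
  }

  static double sum(double[] values) {
    double total = 0;
    for (double v : values) {
      total += v;
    }
    return total;
  }

  static double[] concat(double[] a, double[] b) {
    double[] all = Arrays.copyOf(a, a.length + b.length);
    System.arraycopy(b, 0, all, a.length, b.length);
    return all;
  }

  public static void main(String[] args) {
    double[] shard1 = {1, 2, 100};
    double[] shard2 = {3, 4, 5, 6, 7};
    double[] all = concat(shard1, shard2);

    // Sum: per-shard partial results combine into the exact global answer.
    System.out.println("sum merged from shards: " + (sum(shard1) + sum(shard2)));
    System.out.println("sum over all values:    " + sum(all));

    // Median: combining per-shard medians does not give the global median.
    double medianOfMedians = median(new double[] {median(shard1), median(shard2)});
    System.out.println("median of shard medians: " + medianOfMedians);
    System.out.println("median over all values:  " + median(all));
  }
}
```

An exact distributed median needs the shards to ship more than one number each (for instance all values, or several rounds of refinement), which is one reason such functions are hard to support in sharded environments.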

See SOLR-5963 for some of the other discussion about whether to move this to a
contrib rather than have it be in the mainline code. My concern is that if we
move it to a contrib, it'll just be code that languishes, especially given the
distributed limitations. Would it just be better to use the pluggable
framework? It seems to me that the use-case for single-shard analytics is
becoming less compelling, but that may be a misperception on my part.

Don't want it to seem like there's any decision here, more like I don't want to
introduce this much code into the mainline tree if it doesn't have wide
applicability, and I think the lack of distributed support severely limits how
widely it applies.

That said, I'm not dogmatically opposed either. But I'd like some sense of what
others think about it.

Analytics Component
---

Key: SOLR-5302
URL: https://issues.apache.org/jira/browse/SOLR-5302
Project: Solr
Issue Type: New Feature
Reporter: Steven Bower
Assignee: Erick Erickson
Fix For: 5.0

Attachments: SOLR-5302.patch, SOLR-5302.patch, SOLR-5302.patch,
SOLR-5302.patch, Search Analytics Component.pdf, Statistical Expressions.pdf,
solr_analytics-2013.10.04-2.patch

This ticket is to track a replacement for the StatsComponent. The
AnalyticsComponent supports the following features:
* All functionality of StatsComponent (SOLR-4499)
* Field Faceting (SOLR-3435)
** Support for limit
** Sorting (bucket name or any stat in the bucket)
** Support for offset
* Range Faceting
** Supports all options of standard range faceting
* Query Faceting (SOLR-2925)
* Ability to use overall/field facet statistics as input to range/query
faceting (i.e. calc min/max date and then facet over that range)
* Support for more complex aggregate/mapping operations (SOLR-1622)
** Aggregations: min, max, sum, sum-of-square, count, missing, stddev, mean,
median, percentiles
** Operations: negation, abs, add, multiply, divide, power, log, date math,
string reversal, string concat
** Easily pluggable framework to add additional operations
* New / cleaner output format
Outstanding Issues:
* Multi-value field support for stats (supported for faceting)
* Multi-shard support (may not be possible for some operations, eg median)


[jira] [Commented] (SOLR-5302) Analytics Component

2014-07-03 Thread Craig Shyjak (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051767#comment-14051767
 ] 

Craig Shyjak commented on SOLR-5302:


Thank you for your email. I am currently out of the office.  If the matter is 
urgent please contact Adam Sherry (ashe...@marketforce.com).

Craig Shyjak



[jira] [Issue Comment Deleted] (SOLR-5302) Analytics Component


 [ 
https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-5302:
---

Comment: was deleted

(was: Thank you for your email. I am currently out of the office.  If the 
matter is urgent please contact Adam Sherry (ashe...@marketforce.com).

Craig Shyjak
)


[jira] [Commented] (SOLR-5302) Analytics Component

2014-07-03 Thread Steven Bower (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051799#comment-14051799
 ] 

Steven Bower commented on SOLR-5302:


I think moving to contrib is probably the right thing at this point...


Re: Single Field instance for both DocValues and indexed?

2014-07-03 Thread david.w.smi...@gmail.com

I overlooked a special constructor labelled “Expert” and discovered it
is possible… though I had to override numericValue which seems quite
hacky:

  private static class ComboField extends Field {
    private ComboField(String name, Object value, FieldType type) {
      // This expert constructor allows us to have a field that has
      // both docValues and indexed.
      super(name, type);
      super.fieldsData = value;
    }

    // Is this a hack?  We assume that numericValue() is only called
    // for DocValues purposes.
    @Override
    public Number numericValue() {
      if (fieldType().numericType() == FieldType.NumericType.DOUBLE)
        return Double.doubleToLongBits(super.numericValue().doubleValue());
      //TODO others
      throw new IllegalStateException("unsupported type: "
          + fieldType().numericType());
    }
  }

Why isn’t a single Field with both DocValues and indexed, etc.
supported more officially?

Anyway, I’ll go with this for now.  FYI this very class is going to
show up in spatial BBoxStrategy in a new patch soon.
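
For readers unfamiliar with the trick inside numericValue(): numeric DocValues hold a long per document, so a double has to be carried as its IEEE 754 bit pattern. The following stand-alone sketch shows only that round trip using the JDK; the class and method names are hypothetical and it does not touch any Lucene API.

```java
// Hypothetical illustration; plain JDK only, no Lucene classes involved.
final class DoubleBitsSketch {

  /** Encodes a double as its raw 64-bit pattern, as the numericValue() override does. */
  static long encode(double value) {
    return Double.doubleToLongBits(value);
  }

  /** Decodes the long back into the double; a consumer of the stored value would do this. */
  static double decode(long bits) {
    return Double.longBitsToDouble(bits);
  }

  public static void main(String[] args) {
    double original = 12.5;
    long stored = encode(original);
    System.out.println(original + " -> 0x" + Long.toHexString(stored) + " -> " + decode(stored));
  }
}
```

The encoding is lossless, which is why the override can return the long without the field losing information; the open question in the message above is only whether numericValue() is ever called for a purpose other than DocValues.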

~ David


On Thu, Jul 3, 2014 at 12:48 PM, david.w.smi...@gmail.com
david.w.smi...@gmail.com wrote:
 I was experimenting with having a user-provided/customized FieldType
 for indexing code of (mostly) a set of numeric fields that are of a
 common type.  The user/developer might want the type to both be
 indexed and have docValues, or perhaps just one.  Or maybe stored
 hypothetically for the purposes of this discussion.   Even though
 Lucene’s FieldType allows you to configure both DocValues and
 indexed=true, it appears impossible to provide a single Field instance
 with both options; the constructors force an either-or situation.  Of
 course I know I could add more fields depending on the options (for
 example as seen in Solr’s FieldType); but I think it’s awkward.  It
 *seems* that Lucene’s indexing guts (DefaultIndexingChain) are
 agnostic of this.  Wouldn’t it be great if you could simply provide a
 Field with a value and FieldType (with various options) and it’d just
 work?  Higher up the stack (Solr and presumably ElasticSearch), there
 are abstractions that basically make this possible, but why not at the
 Lucene layer?

 ~ David


[jira] [Commented] (SOLR-5302) Analytics Component


[ 
https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051809#comment-14051809
 ] 

David Smiley commented on SOLR-5302:


bq. I think moving to contrib is probably the right thing at this point...
+1


[jira] [Commented] (SOLR-5963) Finalize interface and backport analytics component to 4x

2014-07-03 Thread Houston Putman (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051811#comment-14051811
 ] 

Houston Putman commented on SOLR-5963:
--

[~hossman], sorry I just found this issue. I can address your concern about the 
statistical expressions.

* I have no issue changing the name of Expressions to something that doesn't 
conflict with existing lucene/solr stuff. I named it that because that is what 
they are, mathematical expressions.
* The difference between the three can be confusing when you first look at it, 
but I think the difference between Aggregates and the other two is pretty 
self-explanatory when you start to actually use them. First of all, Aggregates 
aggregate unspecified amounts of data to a single value (such as median, 
average, standard deviation, etc.); these are statistics tools. The other two 
are a way of mapping (transforming/combining) one or more pieces of data into 
one piece of data (this piece of data may be 0 or 1 dimensional, so a single 
value or an array); these are regular mathematical tools like add, subtract, 
multiply, etc. So you can use Aggregate Mapping Operations/Expressions and 
Field Mapping Operations in the exact same way without thinking about it; the 
only difference between the two is that one maps multiple lists into one list 
and the other maps multiple values into one value. Actually, after typing this 
up I agree that the documentation of the feature could be significantly 
improved, but I am not sure a syntactical difference between aggregates and the 
other two is necessary since they don't share much functionality (really just 
sum and add). Also, the naming of Aggregate Mapping Operations/Expressions 
and Field Mapping Operations should definitely be changed.
* Ok, I'm confused about this point. Field Mapping Operations are ValueSource 
parsers... I used existing ones and added some of my own. 
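
The three categories described above can be sketched in a few lines of plain Java. This is only a toy model under the assumption that a field is a column of doubles; the method names (sumAggregate, addFields, addAggregates) are invented for the example and are not the component's API.

```java
// Hypothetical toy model of the three categories; not the analytics component's API.
final class AggregateVsMappingSketch {

  /** Aggregate: reduces an unspecified amount of data to a single value. */
  static double sumAggregate(double[] column) {
    double total = 0;
    for (double v : column) {
      total += v;
    }
    return total;
  }

  /** Field mapping operation: maps multiple lists into one list, element by element. */
  static double[] addFields(double[] a, double[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("columns must have the same length");
    }
    double[] out = new double[a.length];
    for (int i = 0; i < a.length; i++) {
      out[i] = a[i] + b[i];
    }
    return out;
  }

  /** Aggregate mapping operation: maps multiple (already aggregated) values into one value. */
  static double addAggregates(double x, double y) {
    return x + y;
  }

  public static void main(String[] args) {
    double[] price = {1, 2, 3};
    double[] tax = {0.5, 0.5, 1};

    // Map the fields first, then aggregate the resulting list.
    double viaFieldMapping = sumAggregate(addFields(price, tax));
    // Aggregate each field first, then map the two aggregate values.
    double viaAggregateMapping = addAggregates(sumAggregate(price), sumAggregate(tax));

    System.out.println(viaFieldMapping + " " + viaAggregateMapping);
  }
}
```

For sum and add the two routes agree, which matches the remark that these are about the only functions the categories share; for most other combinations (say, median and multiply) the order of mapping and aggregating changes the result.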

 Finalize interface and backport analytics component to 4x
 -

 Key: SOLR-5963
 URL: https://issues.apache.org/jira/browse/SOLR-5963
 Project: Solr
  Issue Type: Improvement
Affects Versions: 4.9, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
 Attachments: SOLR-5963.patch, SOLR-5963.patch


 Now that we seem to have fixed up the test failures for trunk for the 
 analytics component, we need to solidify the API and back-port it to 4x. For 
 history, see SOLR-5302 and SOLR-5488.
 As far as I know, these are the merges that need to occur to do this (plus 
 any that this JIRA brings up)
 svn merge -c 1543651 https://svn.apache.org/repos/asf/lucene/dev/trunk
 svn merge -c 1545009 https://svn.apache.org/repos/asf/lucene/dev/trunk
 svn merge -c 1545053 https://svn.apache.org/repos/asf/lucene/dev/trunk
 svn merge -c 1545054 https://svn.apache.org/repos/asf/lucene/dev/trunk
 svn merge -c 1545080 https://svn.apache.org/repos/asf/lucene/dev/trunk
 svn merge -c 1545143 https://svn.apache.org/repos/asf/lucene/dev/trunk
 svn merge -c 1545417 https://svn.apache.org/repos/asf/lucene/dev/trunk
 svn merge -c 1545514 https://svn.apache.org/repos/asf/lucene/dev/trunk
 svn merge -c 1545650 https://svn.apache.org/repos/asf/lucene/dev/trunk
 svn merge -c 1546074 https://svn.apache.org/repos/asf/lucene/dev/trunk
 svn merge -c 1546263 https://svn.apache.org/repos/asf/lucene/dev/trunk
 svn merge -c 1559770 https://svn.apache.org/repos/asf/lucene/dev/trunk
 svn merge -c 1583636 https://svn.apache.org/repos/asf/lucene/dev/trunk
 The only remaining thing I think needs to be done is to solidify the 
 interface, see comments from [~yo...@apache.org] on the two JIRAs mentioned, 
 although SOLR-5488 is the most relevant one.
 [~sbower], [~houstonputman] and [~yo...@apache.org] might be particularly 
 interested here.
 I really want to put this to bed, so if we can get agreement on this soon I 
 can make it march.




[JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.7.0) - Build # 1652 - Failure!

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/1652/
Java: 64bit/jdk1.7.0 -XX:-UseCompressedOops -XX:+UseG1GC

All tests passed

Build Log:
[...truncated 45114 lines...]
-documentation-lint:
 [echo] checking for broken html...
[jtidy] Checking for broken html (such as invalid tags)...
   [delete] Deleting directory 
/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/jtidy_tmp
 [echo] Checking for broken links...
 [exec] 
 [exec] Crawl/parse...
 [exec] 
 [exec] Verify...
 [echo] Checking for missing docs...
 [exec] 
 [exec] 
build/docs/facet/org/apache/lucene/facet/taxonomy/TaxonomyMergeUtils.html
 [exec]   missing Constructors: TaxonomyMergeUtils()
 [exec] 
 [exec] Missing javadocs were found!

BUILD FAILED
/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/build.xml:467: The following 
error occurred while executing this line:
/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/build.xml:63: The following 
error occurred while executing this line:
/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build.xml:212: The 
following error occurred while executing this line:
/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build.xml:247: The 
following error occurred while executing this line:
/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/common-build.xml:2338: 
exec returned: 1

Total time: 159 minutes 10 seconds
Build step 'Invoke Ant' marked build as failure
[description-setter] Description set: Java: 64bit/jdk1.7.0 
-XX:-UseCompressedOops -XX:+UseG1GC
Archiving artifacts
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any




[jira] [Comment Edited] (SOLR-5963) Finalize interface and backport analytics component to 4x

2014-07-03 Thread Houston Putman (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051811#comment-14051811
]

Houston Putman edited comment on SOLR-5963 at 7/3/14 7:45 PM:
--

[~hossman], sorry I just found this issue. I can address your concern about the
statistical expressions.

* I have no issue changing the name of Expressions to something that doesn't
conflict with existing lucene/solr stuff. I named it that because that is what
they are, mathematical expressions.
* The difference between the three can be confusing when you first look at it,
but I think the difference between Aggregates and the other two is pretty
self-explanatory when you start to actually use them. First of all, Aggregates
aggregate unspecified amounts of data to a single value (such as median,
average, standard deviation, etc.); these are statistics tools. The other two
are a way of mapping (transforming/combining) one or more pieces of data into
one piece of data (this piece of data may be 0 or 1 dimensional, so a single
value or an array); these are regular mathematical tools like add, subtract,
multiply, etc. So you can use Aggregate Mapping Operations/Expressions and
Field Mapping Operations in the exact same way without thinking about it; the
only difference between the two is that one maps multiple lists into one list
and the other maps multiple values into one value.
Actually, after typing this up I agree that the documentation of the feature
could be significantly improved, but I am not sure a syntactical difference
between aggregates and the other two is necessary since they don't share much
functionality (really just sum and add). Also, the naming of Aggregate
Mapping Operations/Expressions and Field Mapping Operations should
definitely be changed.
* Ok, I'm confused about this point. Field Mapping Operations are ValueSource
parsers... I used existing ones and added some of my own.

was (Author: houstonputman):
[~hossman], sorry I just found this issue. I can address your concern about the
statistical expressions.

* I have no issue changing the name of Expressions to something that doesn't
conflict with existing lucene/solr stuff. I named it that because that is what
they are, mathematical expressions.
* The difference between the three can be confusing when you first look at it,
but I think the difference between Aggregates and the other two are pretty
self-explanatory when you start to actually use them. First of all Aggregates
aggregate unspecified amounts of data to a single value (such as median,
average, standard deviation, etc.) these are statistics tools. The other two
are a way of mapping(transforming/combining) one or more pieces of data into
one piece of data (this piece of data may be 0 or 1 dimensional, so a single
value or an array), these are regular mathematical tools like add, subtract,
multiply, etc. So you can use Aggregate Mapping Operations/Expressions and
Field Mapping Operations in the exact same way without thinking about it, the
only difference between the two is that one maps multiple lists into one list
and the other maps multiple values into one value. Actually after typing this
up I agree that the documentation of the feature could be significantly
improved, but I am not sure a syntactical difference between aggregates and the
other two are necessary since they don't share much functionality (really just
sum and add). Also the naming of Aggregate Mapping Operations/Expressions
and Field Mapping Operations should definitely be changed.
* Ok, I'm confused about this point. Field Mapping Operations are ValueSource
parsers... I used existing ones and added some of my own.

Finalize interface and backport analytics component to 4x
-

Key: SOLR-5963
URL: https://issues.apache.org/jira/browse/SOLR-5963
Project: Solr
Issue Type: Improvement
Affects Versions: 4.9, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Attachments: SOLR-5963.patch, SOLR-5963.patch

Now that we seem to have fixed up the test failures for trunk for the
analytics component, we need to solidify the API and back-port it to 4x. For
history, see SOLR-5302 and SOLR-5488.
As far as I know, these are the merges that need to occur to do this (plus
any that this JIRA brings up)
svn merge -c 1543651 https://svn.apache.org/repos/asf/lucene/dev/trunk
svn merge -c 1545009 https://svn.apache.org/repos/asf/lucene/dev/trunk
svn merge -c 1545053 https://svn.apache.org/repos/asf/lucene/dev/trunk
svn merge -c 1545054 https://svn.apache.org/repos/asf/lucene/dev/trunk
svn merge -c 1545080 https://svn.apache.org/repos/asf/lucene/dev/trunk
svn merge -c 1545143 https://svn.apache.org/repos/asf/lucene/dev/trunk
svn

[jira] [Resolved] (SOLR-2602) It would be great if the Solr site referred to ManifoldCF as a related product

2014-07-03 Thread Cassandra Targett (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett resolved SOLR-2602.
-

Resolution: Fixed

Looking at the site today, this was done at some point in the past. Apache 
Manifold CF is listed under Related Projects at 
https://lucene.apache.org/solr/index.html (and with the updated URL).

 It would be great if the Solr site referred to ManifoldCF as a related product
 --

 Key: SOLR-2602
 URL: https://issues.apache.org/jira/browse/SOLR-2602
 Project: Solr
  Issue Type: Improvement
  Components: documentation
Reporter: Karl Wright
Priority: Minor
 Attachments: SOLR-2602.patch


 The Related products section of the Solr site has just Lucene and Nutch in 
 it.  It would be appropriate to have a link for ManifoldCF as well.  Url 
 would be: http://incubator.apache.org/connectors/




[jira] [Resolved] (SOLR-3198) Apache Solr to adhere to Apache Project Branding Requirements

2014-07-03 Thread Steve Rowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe resolved SOLR-3198.
--

   Resolution: Fixed
Fix Version/s: (was: 4.9)
   (was: 5.0)
 Assignee: Steve Rowe

I fixed branding issues on the Lucene and Solr sites over a year ago.

 Apache Solr to adhere to Apache Project Branding Requirements 
 --

 Key: SOLR-3198
 URL: https://issues.apache.org/jira/browse/SOLR-3198
 Project: Solr
  Issue Type: New Feature
  Components: documentation
Reporter: Lewis John McGibbney
Assignee: Steve Rowe
 Attachments: SOLR-3198.patch, Solr_tm.png


 The ASF project branding requirements [0] provide guidelines for projects to 
 follow and adhere to.
 This is a trivial task, so I'll patch the site and upload it. 
 [0] http://www.apache.org/foundation/marks/pmcs.html




[jira] [Closed] (SOLR-3198) Apache Solr to adhere to Apache Project Branding Requirements

2014-07-03 Thread Steve Rowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe closed SOLR-3198.




Doc for round robin queries for SolrJ

2014-07-03 Thread Jack Krupansky

I wanted to read up on round robin queries using SolrJ, but I found nothing in 
the Solr reference guide.

Some needs:

1. No doc for LBHttpSolrServer. It has a wiki page and Javadoc though.
2. No doc for CloudSolrServer, but a few references. It has very minimal 
Javadoc though.
3. No general discussion or examples for round robin and SolrJ load balancing 
in general.

See:
http://wiki.apache.org/solr/LBHttpSolrServer
http://lucene.apache.org/solr/4_9_0/solr-solrj/org/apache/solr/client/solrj/impl/LBHttpSolrServer.html
http://lucene.apache.org/solr/4_9_0/solr-solrj/org/apache/solr/client/solrj/impl/CloudSolrServer.html
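
As a starting point for the missing general discussion, here is a minimal sketch of what round-robin selection over a fixed list of server URLs looks like. It is plain JDK code with hypothetical names (RoundRobinSketch, pick) and placeholder URLs; it is not how LBHttpSolrServer is implemented, and it does no health checking or failover.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of round-robin selection; not SolrJ code.
final class RoundRobinSketch {
  private final List<String> urls;
  private final AtomicInteger next = new AtomicInteger();

  RoundRobinSketch(String... urls) {
    if (urls.length == 0) {
      throw new IllegalArgumentException("at least one URL is required");
    }
    // Defensive copy so later changes to the caller's array have no effect.
    this.urls = Collections.unmodifiableList(new ArrayList<>(Arrays.asList(urls)));
  }

  /** Returns the next URL in rotation; safe to call from several threads. */
  String pick() {
    // floorMod keeps the index non-negative even after the counter overflows.
    int index = Math.floorMod(next.getAndIncrement(), urls.size());
    return urls.get(index);
  }

  public static void main(String[] args) {
    // The host names below are placeholders.
    RoundRobinSketch balancer =
        new RoundRobinSketch("http://host1:8983/solr", "http://host2:8983/solr");
    for (int i = 0; i < 4; i++) {
      System.out.println(balancer.pick());
    }
  }
}
```

A real client would send each query to the URL returned by pick() and would also need a policy for servers that stop responding, which this sketch leaves out.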

-- Jack Krupansky

[jira] [Commented] (LUCENE-3451) Remove special handling of pure negative Filters in BooleanFilter, disallow pure negative queries in BooleanQuery

2014-07-03 Thread Jack Krupansky (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051934#comment-14051934
]

Jack Krupansky commented on LUCENE-3451:

[~yo...@apache.org] says:

bq. The current handling of boolean queries with only prohibited clauses is not
a bug, but working as designed, so this issue is about changing that behavior.
Currently working applications will now start unexpectedly throwing
exceptions... now that's trappy.

The fact that a pure negative query, actually a sub-query within parentheses in
the query parser, returns zero documents has been a MAJOR problem for Solr
users. I've lost count of how many times it has come up on the user list, and we
tell users to work around the problem by manually inserting \*:\* after the
left parenthesis.

But I am interested in hearing why it is believed that it is working as
designed and whether there are really applications that would intentionally
write a list of negative clauses when the design is that they will simply be
ignored and match no documents. If that kind of compatibility is really needed,
I would say it can be accommodated with a config setting, rather than give
unexpected and bad behavior for so many other people with the current behavior.

I would prefer to fix the problem by having BQ do the right thing by
implicitly starting with a MatchAllDocsQuery if only MUST_NOT clauses are
present, but... if that is not possible, an exception would be much better.

Alternatively, given the difficulty of doing almost anything with the various
query parsers, the method that generates the BQ for the query parser
(QueryParserBase.getBooleanQuery) should just check for pure negative clauses
and then add the MADQ. If this is massively controversial, just add a config
option to disable it.
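
A minimal sketch of the check described above, written against hypothetical stand-in types (Clause, Occur) rather than the real BooleanQuery or QueryParserBase classes, so it only illustrates the logic: if every clause is MUST_NOT, a required match-all clause is put in front.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT Lucene's classes.
class PureNegativeSketch {
    enum Occur { MUST, SHOULD, MUST_NOT }

    static final class Clause {
        final String query;
        final Occur occur;

        Clause(String query, Occur occur) {
            this.query = query;
            this.occur = occur;
        }
    }

    /** True if there is at least one clause and every clause is prohibited. */
    static boolean isPureNegative(List<Clause> clauses) {
        if (clauses.isEmpty()) {
            return false;
        }
        for (Clause c : clauses) {
            if (c.occur != Occur.MUST_NOT) {
                return false;
            }
        }
        return true;
    }

    /** Returns a copy of the clauses, prefixed with a required match-all clause when needed. */
    static List<Clause> withImplicitMatchAll(List<Clause> clauses) {
        List<Clause> result = new ArrayList<>(clauses);
        if (isPureNegative(clauses)) {
            result.add(0, new Clause("*:*", Occur.MUST));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Clause> clauses = new ArrayList<>();
        clauses.add(new Clause("x", Occur.MUST_NOT));
        for (Clause c : withImplicitMatchAll(clauses)) {
            System.out.println(c.occur + " " + c.query);
        }
    }
}
```

Clause lists that already contain a MUST or SHOULD clause are returned unchanged, so existing mixed queries would not be affected by such a check.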

Remove special handling of pure negative Filters in BooleanFilter, disallow
pure negative queries in BooleanQuery
-

Key: LUCENE-3451
URL: https://issues.apache.org/jira/browse/LUCENE-3451
Project: Lucene - Core
Issue Type: Improvement
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 4.9, 5.0

Attachments: LUCENE-3451.patch, LUCENE-3451.patch, LUCENE-3451.patch,
LUCENE-3451.patch, LUCENE-3451.patch

We should at least in Lucene 4.0 remove the hack in BooleanFilter that allows
pure negative Filter clauses. This is not supported by BooleanQuery and
confuses users (I think that's the problem in LUCENE-3450).
The hack is buggy, as it does not respect deleted documents and returns them
in its DocIdSet.
Also we should think about disallowing pure-negative Queries at all and throw
UOE.

--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-3451) Remove special handling of pure negative Filters in BooleanFilter, disallow pure negative queries in BooleanQuery


[ 
https://issues.apache.org/jira/browse/LUCENE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051953#comment-14051953
 ] 

Yonik Seeley commented on LUCENE-3451:
--

bq. But I am interested in hearing why it is believed that it is working as 
designed and whether there are really applications that would intentionally 
write a list of negative clauses

Machine generated queries (including those from our own query parsers).
For example, (a -x) reduces to (-x) if a is a stopword.  Inserting *:* when a 
boolean query contains only negative clauses was vetoed in LUCENE-3460.
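
The stopword case can be modelled in a few lines. This is only a sketch with plain strings standing in for parsed clauses (a leading "-" marks a prohibited clause); the class and method names are made up for illustration and are not part of any query parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model: clauses are strings, "-term" means a prohibited clause.
class StopwordReductionSketch {
    /** Drops every clause whose term is a stopword, as analysis would. */
    static List<String> dropStopwords(List<String> clauses, Set<String> stopwords) {
        List<String> kept = new ArrayList<>();
        for (String clause : clauses) {
            String term = clause.startsWith("-") ? clause.substring(1) : clause;
            if (!stopwords.contains(term)) {
                kept.add(clause);
            }
        }
        return kept;
    }

    /** True if at least one clause remains and all remaining clauses are prohibited. */
    static boolean isPureNegative(List<String> clauses) {
        if (clauses.isEmpty()) {
            return false;
        }
        for (String clause : clauses) {
            if (!clause.startsWith("-")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> stopwords = new HashSet<>(Arrays.asList("a"));
        List<String> reduced = dropStopwords(Arrays.asList("a", "-x"), stopwords);
        System.out.println(reduced + " pureNegative=" + isPureNegative(reduced));
    }
}
```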

 Remove special handling of pure negative Filters in BooleanFilter, disallow 
 pure negative queries in BooleanQuery
 -

 Key: LUCENE-3451
 URL: https://issues.apache.org/jira/browse/LUCENE-3451
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 4.9, 5.0

 Attachments: LUCENE-3451.patch, LUCENE-3451.patch, LUCENE-3451.patch, 
 LUCENE-3451.patch, LUCENE-3451.patch



[jira] [Comment Edited] (LUCENE-3451) Remove special handling of pure negative Filters in BooleanFilter, disallow pure negative queries in BooleanQuery

[
https://issues.apache.org/jira/browse/LUCENE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051953#comment-14051953
]

Yonik Seeley edited comment on LUCENE-3451 at 7/3/14 9:55 PM:
--

bq. But I am interested in hearing why it is believed that it is working as
designed and whether there are really applications that would intentionally
write a list of negative clauses

Machine generated queries (including those from our own query parsers).
For example, (a -x) reduces to (-x) if a is a stopword. Inserting \*:\* when
a boolean query contains only negative clauses was vetoed in LUCENE-3460.

was (Author: ysee...@gmail.com):
bq. But I am interested in hearing why it is believed that it is working as
designed and whether there are really applications that would intentionally
write a list of negative clauses

Machine generated queries (including those from our own query parsers).
For example, (a -x) reduces to (-x) if a is a stopword. Inserting *:* when a
boolean query contains only negative clauses was vetoed in LUCENE-3460.

Remove special handling of pure negative Filters in BooleanFilter, disallow
pure negative queries in BooleanQuery
-

Key: LUCENE-3451
URL: https://issues.apache.org/jira/browse/LUCENE-3451
Project: Lucene - Core
Issue Type: Improvement
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 4.9, 5.0

Attachments: LUCENE-3451.patch, LUCENE-3451.patch, LUCENE-3451.patch,
LUCENE-3451.patch, LUCENE-3451.patch


[jira] [Created] (LUCENE-5803) Add another AnalyzerWrapper class that does not have its own cache, so delegate-only wrappers don't create thread local resources several times

Uwe Schindler created LUCENE-5803:
-

 Summary: Add another AnalyzerWrapper class that does not have its 
own cache, so delegate-only wrappers don't create thread local resources 
several times
 Key: LUCENE-5803
 URL: https://issues.apache.org/jira/browse/LUCENE-5803
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.9
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, 4.10


This is a followup issue for the following Elasticsearch issue: 
https://github.com/elasticsearch/elasticsearch/pull/6714

Basically the problem is the following:
- Elasticsearch has a pool of Analyzers that are used for analysis in several 
indexes
- Each index uses a different PerFieldAnalyzerWrapper

PerFieldAnalyzerWrapper uses PER_FIELD_REUSE_STRATEGY. Because of this it 
caches the tokenstreams for every field. If there are many fields, this is a 
lot. In addition, the underlying analyzers may also cache tokenstreams and 
other PerFieldAnalyzerWrappers do the same, although the delegate Analyzer can 
always return the same components.

We should add code similar to Elasticsearch's directly to Lucene: If the 
delegating Analyzer just delegates per field or just wraps CharFilters around 
the Reader, there is no need to cache the TokenStreamComponents a second time 
in the delegating Analyzer. This is only needed if the delegating Analyzer 
adds additional TokenFilters (like ShingleAnalyzerWrapper).

We should name this new class DelegatingAnalyzerWrapper extends 
AnalyzerWrapper. The wrapComponents method must be final, because we are not 
allowed to add additional TokenFilters, but unlike ES, we don't need to 
disallow wrapping with CharFilters.

Internally this class uses a private ReuseStrategy that just delegates to the 
underlying analyzer. It does not matter here if the strategy of the delegate is 
global or per field, this is private to the delegate.
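
To make the caching difference concrete, here is a small model. It is a sketch only: Delegate, PerFieldCachingWrapper and DelegatingWrapper are hypothetical stand-ins, not Lucene's Analyzer, AnalyzerWrapper or ReuseStrategy, and thread-local storage is left out. It counts how many component objects get created when several fields share one delegate.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the two caching approaches; not Lucene's API.
class DelegatingReuseSketch {
    /** Stand-in for a delegate analyzer with its own (global) component cache. */
    static final class Delegate {
        int created = 0;
        private Object cached;

        Object create() {
            created++;
            return new Object();
        }

        Object reusable() {
            if (cached == null) {
                cached = create();
            }
            return cached;
        }
    }

    /** Wrapper with its own per-field cache: one component object per field. */
    static final class PerFieldCachingWrapper {
        private final Delegate delegate;
        private final Map<String, Object> perField = new HashMap<>();

        PerFieldCachingWrapper(Delegate delegate) {
            this.delegate = delegate;
        }

        Object componentsFor(String field) {
            Object c = perField.get(field);
            if (c == null) {
                c = delegate.create();
                perField.put(field, c);
            }
            return c;
        }
    }

    /** Wrapper without a cache of its own: reuse is left to the delegate. */
    static final class DelegatingWrapper {
        private final Delegate delegate;

        DelegatingWrapper(Delegate delegate) {
            this.delegate = delegate;
        }

        Object componentsFor(String field) {
            return delegate.reusable();
        }
    }

    public static void main(String[] args) {
        String[] fields = {"title", "body", "author"};

        Delegate d1 = new Delegate();
        PerFieldCachingWrapper caching = new PerFieldCachingWrapper(d1);
        Delegate d2 = new Delegate();
        DelegatingWrapper delegating = new DelegatingWrapper(d2);

        for (String f : fields) {
            caching.componentsFor(f);
            delegating.componentsFor(f);
        }
        System.out.println("per-field cache: " + d1.created + ", delegating: " + d2.created);
    }
}
```

In this model the per-field wrapper creates one component object per field, while the delegating wrapper creates one per delegate regardless of how many fields map to it.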




[jira] [Comment Edited] (LUCENE-3451) Remove special handling of pure negative Filters in BooleanFilter, disallow pure negative queries in BooleanQuery

[
https://issues.apache.org/jira/browse/LUCENE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051953#comment-14051953
]

Yonik Seeley edited comment on LUCENE-3451 at 7/3/14 10:00 PM:
---

bq. But I am interested in hearing why it is believed that it is working as
designed

Working as designed means just that... not that it's optimal, but that it is
working the way the original author intended. FWIW, I was really only against
throwing an exception. I personally think it would be fine to insert \*:\* for
the user where appropriate.

bq. and whether there are really applications that would intentionally write a
list of negative clauses

Remove special handling of pure negative Filters in BooleanFilter, disallow
pure negative queries in BooleanQuery
-

Key: LUCENE-3451
URL: https://issues.apache.org/jira/browse/LUCENE-3451
Project: Lucene - Core
Issue Type: Improvement
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 4.9, 5.0

Attachments: LUCENE-3451.patch, LUCENE-3451.patch, LUCENE-3451.patch,
LUCENE-3451.patch, LUCENE-3451.patch


[jira] [Commented] (LUCENE-5803) Add another AnalyzerWrapper class that does not have its own cache, so delegate-only wrappers don't create thread local resources several times


[ 
https://issues.apache.org/jira/browse/LUCENE-5803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051964#comment-14051964
 ] 

Uwe Schindler commented on LUCENE-5803:
---

In fact, this makes PER_FIELD_REUSE_STRATEGY obsolete for the 
PerFieldAnalyzerWrapper case, because in PerFieldAnalyzerWrapper we just leave 
the component caching up to the inner Analyzer, which can use GLOBAL or 
whatever else. This has the good effect that we don't cache a TokenStream for 
every field, just for every delegate Analyzer.

 Add another AnalyzerWrapper class that does not have its own cache, so 
 delegate-only wrappers don't create thread local resources several times
 ---

 Key: LUCENE-5803
 URL: https://issues.apache.org/jira/browse/LUCENE-5803
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.9
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, 4.10



[jira] [Commented] (LUCENE-3451) Remove special handling of pure negative Filters in BooleanFilter, disallow pure negative queries in BooleanQuery

2014-07-03 Thread Jack Krupansky (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051978#comment-14051978
 ] 

Jack Krupansky commented on LUCENE-3451:


Thanks, [~yo...@apache.org]. Although the (a -x) stop word case seems to 
argue even more strenuously for at least an exception if \*:\* can't be 
inserted.

Besides, the stop word case is better handled by the Lucid approach of keeping 
all stop words (if they are indexed) if the sub-query terms are all stop words 
as in this case. So it would only be problematic for the case of non-indexed 
stop words, which is really an anti-pattern anyway these days.

 Remove special handling of pure negative Filters in BooleanFilter, disallow 
 pure negative queries in BooleanQuery
 -

 Key: LUCENE-3451
 URL: https://issues.apache.org/jira/browse/LUCENE-3451
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 4.9, 5.0

 Attachments: LUCENE-3451.patch, LUCENE-3451.patch, LUCENE-3451.patch, 
 LUCENE-3451.patch, LUCENE-3451.patch



[jira] [Commented] (LUCENE-3451) Remove special handling of pure negative Filters in BooleanFilter, disallow pure negative queries in BooleanQuery

2014-07-03 Thread Jack Krupansky (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051980#comment-14051980
]

Jack Krupansky commented on LUCENE-3451:

[~yo...@apache.org] says:

bq. I personally think it would be fine to insert *:* for the user where
appropriate.

Ah! Since the divorce that gave Solr custody of its own copy of
QueryParserBase, this change could be made there, right? I can file a Solr Jira
for that (or just use one of the two open Solr issues related to pure-negative
sub-queries), unless you want to do it. And then if the Solr people are happy
over there, the Lucene guys can have their exception here and close this issue,
and then everybody can live happily ever after, right?

Remove special handling of pure negative Filters in BooleanFilter, disallow
pure negative queries in BooleanQuery
-

Key: LUCENE-3451
URL: https://issues.apache.org/jira/browse/LUCENE-3451
Project: Lucene - Core
Issue Type: Improvement
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 4.9, 5.0

Attachments: LUCENE-3451.patch, LUCENE-3451.patch, LUCENE-3451.patch,
LUCENE-3451.patch, LUCENE-3451.patch


[JENKINS] Lucene-Solr-Tests-trunk-Java7 - Build # 4762 - Still Failing

Build: https://builds.apache.org/job/Lucene-Solr-Tests-trunk-Java7/4762/

All tests passed

Build Log:
[...truncated 45174 lines...]
-documentation-lint:
 [echo] checking for broken html...
[jtidy] Checking for broken html (such as invalid tags)...
   [delete] Deleting directory 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/lucene/build/jtidy_tmp
 [echo] Checking for broken links...
 [exec] 
 [exec] Crawl/parse...
 [exec] 
 [exec] Verify...
 [echo] Checking for missing docs...
 [exec] 
 [exec] 
build/docs/facet/org/apache/lucene/facet/taxonomy/TaxonomyMergeUtils.html
 [exec]   missing Constructors: TaxonomyMergeUtils()
 [exec] 
 [exec] Missing javadocs were found!

BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/build.xml:467:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/build.xml:63:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/lucene/build.xml:212:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/lucene/build.xml:247:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/lucene/common-build.xml:2338:
 exec returned: 1

Total time: 129 minutes 15 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Sending artifact delta relative to Lucene-Solr-Tests-trunk-Java7 #4753
Archived 1 artifacts
Archive block size is 32768
Received 0 blocks and 464 bytes
Compression is 0.0%
Took 17 ms
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Created] (SOLR-6223) QueryComponents may throw NPE when using shards.tolerant and there is a failure in the “GET_FIELDS/GET_HIGHLIGHTS/GET_DEBUG” phase

2014-07-03 Thread JIRA

Tomás Fernández Löbbe created SOLR-6223:
---

 Summary: QueryComponents may throw NPE when using shards.tolerant 
and there is a failure in the “GET_FIELDS/GET_HIGHLIGHTS/GET_DEBUG” phase
 Key: SOLR-6223
 URL: https://issues.apache.org/jira/browse/SOLR-6223
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.9, 5.0
Reporter: Tomás Fernández Löbbe


I found that, when using shards.tolerant, if there is some kind of exception in 
the second phase of the search, some components throw an NPE. 
I found it with the QueryComponent first, but then saw that other components 
would suffer in the same way: DebugComponent, HighlightComponent and 
MLTComponent. I only checked the components of the default chain.
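
The general shape of the problem can be sketched as follows. This is an illustration only: ShardResponse and mergeDocs are hypothetical names, not Solr's classes or the attached patch, and the assumption is that a shard that failed in the second phase leaves no payload behind (modelled here as a null list) which the merging code must skip instead of dereferencing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of tolerant merging; not Solr's API.
class TolerantMergeSketch {
    static final class ShardResponse {
        final String shard;
        final List<String> docs; // null when the shard request failed

        ShardResponse(String shard, List<String> docs) {
            this.shard = shard;
            this.docs = docs;
        }
    }

    /** Merges documents from all shards, skipping shards that returned nothing. */
    static List<String> mergeDocs(List<ShardResponse> responses) {
        List<String> merged = new ArrayList<>();
        for (ShardResponse r : responses) {
            if (r.docs == null) {
                continue; // tolerated failure: no payload to read
            }
            merged.addAll(r.docs);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<ShardResponse> responses = Arrays.asList(
            new ShardResponse("shard1", Arrays.asList("doc1", "doc2")),
            new ShardResponse("shard2", null),
            new ShardResponse("shard3", Arrays.asList("doc3")));
        System.out.println(mergeDocs(responses));
    }
}
```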




[jira] [Updated] (SOLR-6223) SearchComponents may throw NPE when using shards.tolerant and there is a failure in the “GET_FIELDS/GET_HIGHLIGHTS/GET_DEBUG” phase

2014-07-03 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/SOLR-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomás Fernández Löbbe updated SOLR-6223:


Summary: SearchComponents may throw NPE when using shards.tolerant and 
there is a failure in the “GET_FIELDS/GET_HIGHLIGHTS/GET_DEBUG” phase  (was: 
QueryComponents may throw NPE when using shards.tolerant and there is a failure 
in the “GET_FIELDS/GET_HIGHLIGHTS/GET_DEBUG” phase)

 SearchComponents may throw NPE when using shards.tolerant and there is a 
 failure in the “GET_FIELDS/GET_HIGHLIGHTS/GET_DEBUG” phase
 ---

 Key: SOLR-6223
 URL: https://issues.apache.org/jira/browse/SOLR-6223
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.9, 5.0
Reporter: Tomás Fernández Löbbe


[jira] [Updated] (SOLR-6223) SearchComponents may throw NPE when using shards.tolerant and there is a failure in the “GET_FIELDS/GET_HIGHLIGHTS/GET_DEBUG” phase

2014-07-03 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/SOLR-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomás Fernández Löbbe updated SOLR-6223:


Attachment: SOLR-6223.patch

 SearchComponents may throw NPE when using shards.tolerant and there is a 
 failure in the “GET_FIELDS/GET_HIGHLIGHTS/GET_DEBUG” phase
 ---

 Key: SOLR-6223
 URL: https://issues.apache.org/jira/browse/SOLR-6223
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.9, 5.0
Reporter: Tomás Fernández Löbbe
 Attachments: SOLR-6223.patch



[jira] [Updated] (LUCENE-5803) Add another AnalyzerWrapper class that does not have its own cache, so delegate-only wrappers don't create thread local resources several times


 [ 
https://issues.apache.org/jira/browse/LUCENE-5803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-5803:
--

Attachment: LUCENE-5803.patch

Patch.

I added more Javadocs and tried to work around the stupid problem that the 
super constructor call cannot reference {{this}}. It would be possible 
to do this by using the passed-in Analyzer, but then we lose the check 
that throws the IllegalStateException.

We need this check, otherwise you would be able to corrupt your analyzers: If 
you wrap this analyzer again with some other analyzer that uses the delegate's 
reuse strategy, e.g., {{new ShingleAnalyzerWrapper(new 
PerFieldAnalyzerWrapper())}}, the ShingleAnalyzerWrapper would reuse the 
PerFieldAnalyzerWrapper's strategy (which is private to the 
PerFieldAnalyzerWrapper) and thereby inject illegal TokenStreamComponents into 
the inner analyzer's cache. So we must disallow this.
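
As a rough illustration of such a check (a sketch under assumptions, not the code in the patch): PrivateStrategy below is a hypothetical stand-in for a reuse strategy that remembers which object owns it and refuses to serve any other caller.

```java
// Hypothetical model of an owner-checked reuse strategy; not Lucene's API.
class OwnerCheckedStrategySketch {
    static final class PrivateStrategy {
        private final Object owner;
        private Object cached;

        PrivateStrategy(Object owner) {
            this.owner = owner;
        }

        Object getReusable(Object caller) {
            check(caller);
            return cached;
        }

        void setReusable(Object caller, Object components) {
            check(caller);
            cached = components;
        }

        /** Rejects any caller other than the owner (identity comparison). */
        private void check(Object caller) {
            if (caller != owner) {
                throw new IllegalStateException("strategy is private to its owner");
            }
        }
    }

    public static void main(String[] args) {
        Object inner = new Object(); // stands in for the delegating wrapper
        Object outer = new Object(); // stands in for a wrapper around it
        PrivateStrategy strategy = new PrivateStrategy(inner);

        strategy.setReusable(inner, "components");
        try {
            strategy.setReusable(outer, "foreign components");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Without the identity check, the outer wrapper in this model could overwrite the inner wrapper's cached components, which is the kind of corruption the comment describes.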

This patch is still missing tests for this special case, as well as tests that 
everything works correctly.

Solr is also using this Analyzer, so we see the improvements in Solr, too (not 
only in Elasticsearch). In fact, PER_FIELD_REUSE_STRATEGY is no longer used for 
pure per-field delegates. We no longer have one TokenStream instance per field, 
we have one instance per delegate Analyzer.

 Add another AnalyzerWrapper class that does not have its own cache, so 
 delegate-only wrappers don't create thread local resources several times
 ---

 Key: LUCENE-5803
 URL: https://issues.apache.org/jira/browse/LUCENE-5803
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.9
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5803.patch



[jira] [Commented] (LUCENE-5803) Add another AnalyzerWrapper class that does not have its own cache, so delegate-only wrappers don't create thread local resources several times


[ 
https://issues.apache.org/jira/browse/LUCENE-5803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052015#comment-14052015
 ] 

Uwe Schindler commented on LUCENE-5803:
---

I am not sure about the ideal name for this wrapper. Suggestions?

 Add another AnalyzerWrapper class that does not have its own cache, so 
 delegate-only wrappers don't create thread local resources several times
 ---

 Key: LUCENE-5803
 URL: https://issues.apache.org/jira/browse/LUCENE-5803
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.9
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5803.patch



[jira] [Commented] (LUCENE-5803) Add another AnalyzerWrapper class that does not have its own cache, so delegate-only wrappers don't create thread local resources several times

2014-07-03 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052019#comment-14052019
 ] 

Robert Muir commented on LUCENE-5803:
-

I think the name is fine myself. It's for delegation.

 Add another AnalyzerWrapper class that does not have its own cache, so 
 delegate-only wrappers don't create thread local resources several times
 ---

 Key: LUCENE-5803
 URL: https://issues.apache.org/jira/browse/LUCENE-5803
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.9
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5803.patch



[jira] [Commented] (LUCENE-5803) Add another AnalyzerWrapper class that does not have its own cache, so delegate-only wrappers don't create thread local resources several times

2014-07-03 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052029#comment-14052029
 ] 

Robert Muir commented on LUCENE-5803:
-

And for that reason, you shouldn't be able to wrap it with a CharFilter. Use the 
existing subclass for tweaking the analyzer. Let this one be for pure 
delegation...

 Add another AnalyzerWrapper class that does not have its own cache, so 
 delegate-only wrappers don't create thread local resources several times
 ---

 Key: LUCENE-5803
 URL: https://issues.apache.org/jira/browse/LUCENE-5803
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.9
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5803.patch


 This is a followup issue for the following Elasticsearch issue: 
 https://github.com/elasticsearch/elasticsearch/pull/6714
 Basically the problem is the following:
 - Elasticsearch has a pool of Analyzers that are used for analysis in several 
 indexes
 - Each index uses a different PerFieldAnalyzerWrapper
 PerFieldAnalyzerWrapper uses PER_FIELD_REUSE_STRATEGY. Because of this it 
 caches the tokenstreams for every field. If there are many fields, this are a 
 lot. In addition, the underlying analyzers may also cache tokenstreams and 
 other PerFieldAnalyzerWrappers do the same, although the delegate Analyzer 
 can always return the same components.
 We should add similar code to Elasticsearch's directly to Lucene: If the 
 delegating Analyzer just delegates per Field or just wraps CharFilters around 
 the Reader, there is no need to cache the TokenStreamComponents a second time 
 in the delegating Analyzers. This is only needed, if the delegating Analyzers 
 adds additional TokenFilters (like ShingleAnalyzerWrapper).
 We should name this new class DelegatingAnalyzerWrapper extends 
 AnalyzerWrapper. The wrapComponents method must be final, because we are not 
 allowed to add additional TokenFilters, but unlike ES, we don't need to 
 disallow wrapping with CharFilters.
 Internally this class uses a private ReuseStrategy that just delegates to the 
 underlying analyzer. It does not matter here if the strategy of the delegate 
 is global or per field, this is private to the delegate.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (LUCENE-5790) MutableValue compareTo impls seem to be broken for exists==false

2014-07-03 Thread Hoss Man (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-5790:
-

Attachment: LUCENE-5790.patch

updated patch:
* continues along assumption of the previous patch (that callers who set 
exist=false must reset value to default) per yonik's comments
* adds class level javadocs explaining this expectation of the caller
* adds additional tests of each type of MutableValue
* in addition to yonik's MutableValueDouble fix from the previous patch, this 
also includes Ebisawa's MutableValueBool fix.
* Also includes a randomized solr grouping test that heavily stresses docs with 
missing values in the grouping fields, and demonstrates both of the bugs 
Ebisawa mentioned in his email (w/o the fixes of course)
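
For readers who have not followed the earlier patches, the caller contract described in the bullets above can be sketched in a few lines. This is only an illustration: DoubleSlot is a hypothetical stand-in and not Lucene's MutableValueDouble. It assumes that a slot marked as missing always carries the default value (0.0), and that when two values compare equal a missing slot orders before an existing one.

```java
// Hypothetical stand-in for a MutableValue-style holder; not the Lucene class.
final class DoubleSlot implements Comparable<DoubleSlot> {
    double value;
    boolean exists;

    DoubleSlot(double value, boolean exists) {
        // Assumed caller contract: a missing slot is reset to the default value.
        this.value = exists ? value : 0.0;
        this.exists = exists;
    }

    @Override
    public int compareTo(DoubleSlot other) {
        // Compare the stored values first.
        int c = Double.compare(value, other.value);
        if (c != 0) {
            return c;
        }
        // Equal values: the exists flags decide.
        if (exists == other.exists) {
            return 0;
        }
        // Assumption: a missing slot orders before an existing slot with the same value.
        return exists ? 1 : -1;
    }
}
```

With this shape, two missing slots always compare as equal regardless of what value they held before being marked missing, which is the property grouping code would rely on.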


I think this is ready to commit.

 MutableValue compareTo impls seem to be broken for exists==false
 

 Key: LUCENE-5790
 URL: https://issues.apache.org/jira/browse/LUCENE-5790
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Hoss Man
 Attachments: LUCENE-5790.patch, LUCENE-5790.patch, LUCENE-5790.patch


 On the solr-user mailing list, Ebisawa & Alex both commented that they've 
 noticed bugs in the grouping code when some documents don't have values in 
 the grouping field.
 In Ebisawa's case, he tracked this down to what appears to be some bugs in 
 the logic of the compareSameType method of some of the MutableValue 
 implementations.
 Thread: 
 https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201406.mbox/%3c84f86fce4b8f42268048aecfb2806...@sixpr04mb045.apcprd04.prod.outlook.com%3E




[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 1688 - Still Failing!

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1688/
Java: 64bit/jdk1.7.0 -XX:-UseCompressedOops -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 44799 lines...]
-documentation-lint:
 [echo] checking for broken html...
[jtidy] Checking for broken html (such as invalid tags)...
   [delete] Deleting directory 
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/jtidy_tmp
 [echo] Checking for broken links...
 [exec] 
 [exec] Crawl/parse...
 [exec] 
 [exec] Verify...
 [echo] Checking for missing docs...
 [exec] 
 [exec] 
build/docs/facet/org/apache/lucene/facet/taxonomy/TaxonomyMergeUtils.html
 [exec]   missing Constructors: TaxonomyMergeUtils()
 [exec] 
 [exec] Missing javadocs were found!

BUILD FAILED
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/build.xml:467: The following 
error occurred while executing this line:
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/build.xml:63: The following 
error occurred while executing this line:
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build.xml:212: The 
following error occurred while executing this line:
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build.xml:247: The 
following error occurred while executing this line:
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/common-build.xml:2338: 
exec returned: 1

Total time: 148 minutes 22 seconds
Build step 'Invoke Ant' marked build as failure
[description-setter] Description set: Java: 64bit/jdk1.7.0 
-XX:-UseCompressedOops -XX:+UseParallelGC
Archiving artifacts
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any




[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0_20-ea-b15) - Build # 10717 - Failure!

2014-07-03 Thread ASF subversion and git services (JIRA)

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10717/
Java: 64bit/jdk1.8.0_20-ea-b15 -XX:+UseCompressedOops -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 44795 lines...]
-documentation-lint:
 [echo] checking for broken html...
[jtidy] Checking for broken html (such as invalid tags)...
   [delete] Deleting directory 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/jtidy_tmp
 [echo] Checking for broken links...
 [exec] 
 [exec] Crawl/parse...
 [exec] 
 [exec] Verify...
 [echo] Checking for missing docs...
 [exec] Traceback (most recent call last):
 [exec]   File "/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/dev-tools/scripts/checkJavaDocs.py", line 371, in <module>
 [exec]     if checkPackageSummaries(sys.argv[1], level):
 [exec]   File "/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/dev-tools/scripts/checkJavaDocs.py", line 351, in checkPackageSummaries
 [exec]     if checkClassSummaries(fullPath):
 [exec]   File "/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/dev-tools/scripts/checkJavaDocs.py", line 215, in checkClassSummaries
 [exec]     missing.append((lastCaption, unEscapeURL(lastItem)))
 [exec]   File "/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/dev-tools/scripts/checkJavaDocs.py", line 303, in unEscapeURL
 [exec]     s = s.replace('%20', ' ')
 [exec] AttributeError: 'NoneType' object has no attribute 'replace'

BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:467: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:63: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build.xml:212: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build.xml:247: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/common-build.xml:2338:
 exec returned: 1

Total time: 77 minutes 9 seconds
Build step 'Invoke Ant' marked build as failure
[description-setter] Description set: Java: 64bit/jdk1.8.0_20-ea-b15 
-XX:+UseCompressedOops -XX:+UseSerialGC
Archiving artifacts
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any




[jira] [Commented] (LUCENE-5801) Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader


[ 
https://issues.apache.org/jira/browse/LUCENE-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052133#comment-14052133
 ] 

ASF subversion and git services commented on LUCENE-5801:
-

Commit 1607781 from [~shaie] in branch 'dev/trunk'
[ https://svn.apache.org/r1607781 ]

LUCENE-5801: rename test vars, class and add missing ctor

 Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 -

 Key: LUCENE-5801
 URL: https://issues.apache.org/jira/browse/LUCENE-5801
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.7
Reporter: Nicola Buso
Assignee: Shai Erera
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5801.patch, LUCENE-5801.patch, 
 LUCENE-5801_1.patch, LUCENE-5801_2.patch


 After Lucene 4.6.1 the class:
 org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 was removed; resurrect it because it is used when merging indexes related to 
 merged taxonomies.




Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0_20-ea-b15) - Build # 10711 - Still Failing!

2014-07-03 Thread ASF subversion and git services (JIRA)

Committed a fix.


On Thu, Jul 3, 2014 at 7:56 PM, Michael McCandless 
luc...@mikemccandless.com wrote:

 +1 to add the commit in RIW if it does an operation like forceMerge in
 close.

 Mike McCandless

 http://blog.mikemccandless.com


 On Thu, Jul 3, 2014 at 12:23 PM, Shai Erera sh...@apache.org wrote:
  I am still digging - this isn't related to IW.changeCount, but
  SegInfos.counter.
 
  I think there's a bug in RandomIndexWriter - when you call close(), it
  randomly executes a forceMerge(), then closes, without committing.
  In IW.close(), we only check for lost changes prior to 5.0, therefore we
  don't hit the RuntimeException, and the changes made by forceMerge() are
  rolled-back.
  But at that point, _1.fdx was already written (even though it was deleted by
  rollback()), therefore addIndexes cannot write it again.
 
  This seems to be a bug in RIW, not in core code, and if I add w.commit()
  after that randomForceMerge, the test passes.
 
  I will review it again later and commit, perhaps someone can take a second
  look.
 
  Shai
 
 
  On Thu, Jul 3, 2014 at 6:50 PM, Shai Erera sh...@apache.org wrote:
 
  It reproduces ... something's fishy with IW.changeCount -- seems that
  after you open an IW on an existing Directory, changeCount = 0. I will try
  to reproduce this in a standalone test to verify if it's a general IW bug or
  not.
 
  Shai
 
 
  On Thu, Jul 3, 2014 at 2:54 PM, Policeman Jenkins Server
  jenk...@thetaphi.de wrote:
 
  Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10711/
  Java: 64bit/jdk1.8.0_20-ea-b15 -XX:+UseCompressedOops -XX:+UseG1GC
 
  1 tests failed.
  FAILED:
  org.apache.lucene.facet.taxonomy.OrdinalMappingReaderTest.testTaxonomyMergeUtils
  
  Error Message:
  file _1.fdx was already written to
  
  Stack Trace:
  java.io.IOException: file _1.fdx was already written to
  at __randomizedtesting.SeedInfo.seed([2AFF1A4359AFD7EB:C8D30CECD284E330]:0)
  at org.apache.lucene.store.MockDirectoryWrapper.createOutput(MockDirectoryWrapper.java:492)
  at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:44)
  at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:110)
  at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:120)
  at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:327)
  at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:98)
  at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:2590)
  at org.apache.lucene.facet.taxonomy.TaxonomyMergeUtils.merge(TaxonomyMergeUtils.java:56)
  at org.apache.lucene.facet.taxonomy.OrdinalMappingReaderTest.testTaxonomyMergeUtils(OrdinalMappingReaderTest.java:73)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:483)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
  at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
  at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
  at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
  at

[jira] [Commented] (LUCENE-5801) Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader


[ 
https://issues.apache.org/jira/browse/LUCENE-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052134#comment-14052134
 ] 

ASF subversion and git services commented on LUCENE-5801:
-

Commit 1607782 from [~shaie] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1607782 ]

LUCENE-5801: rename test vars, class and add missing ctor

 Resurrect org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 -

 Key: LUCENE-5801
 URL: https://issues.apache.org/jira/browse/LUCENE-5801
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.7
Reporter: Nicola Buso
Assignee: Shai Erera
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5801.patch, LUCENE-5801.patch, 
 LUCENE-5801_1.patch, LUCENE-5801_2.patch


 After Lucene 4.6.1 the class:
 org.apache.lucene.facet.util.OrdinalMappingAtomicReader
 was removed; resurrect it because it is used when merging indexes related to 
 merged taxonomies.




[jira] [Created] (LUCENE-5804) Add makeShapeValueSource to SpatialStrategy

David Smiley created LUCENE-5804:


 Summary: Add makeShapeValueSource to SpatialStrategy
 Key: LUCENE-5804
 URL: https://issues.apache.org/jira/browse/LUCENE-5804
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/spatial
Reporter: David Smiley


The notion of a ValueSource that yields a Shape from 
FunctionValues.objectVal(docId) was introduced with SerializedDVStrategy, and I 
rather like it.  I think the base SpatialStrategy abstraction should be amended 
with it.  In addition, a marker class ShapeValueSource that simply extends 
ValueSource would clarify when/where these special value sources are used with 
a bit of type safety.
{code:java}
  /**
   * Provides access to a Shape per document via ValueSource in which
   * {@link org.apache.lucene.queries.function.FunctionValues#objectVal(int)}
   * returns a {@link Shape}.
   */
  public ShapeValueSource makeShapeValueSource() {
    throw new UnsupportedOperationException();
  }

  //(use existing javadocs)
  public ValueSource makeDistanceValueSource(Point queryPoint, double multiplier) {
    return new DistanceToShapeValueSource(makeShapeValueSource(), queryPoint,
        multiplier, ctx);
  }
{code}

SerializedDVStrategy & BBoxStrategy would use this; PointVectorStrategy could 
be modified to.




[JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.8.0) - Build # 1653 - Still Failing!

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/1653/
Java: 64bit/jdk1.8.0 -XX:-UseCompressedOops -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 44785 lines...]
-documentation-lint:
 [echo] checking for broken html...
[jtidy] Checking for broken html (such as invalid tags)...
   [delete] Deleting directory 
/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/jtidy_tmp
 [echo] Checking for broken links...
 [exec] 
 [exec] Crawl/parse...
 [exec] 
 [exec] Verify...
 [echo] Checking for missing docs...
 [exec] Traceback (most recent call last):
 [exec]   File "/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/dev-tools/scripts/checkJavaDocs.py", line 371, in <module>
 [exec]     if checkPackageSummaries(sys.argv[1], level):
 [exec]   File "/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/dev-tools/scripts/checkJavaDocs.py", line 351, in checkPackageSummaries
 [exec]     if checkClassSummaries(fullPath):
 [exec]   File "/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/dev-tools/scripts/checkJavaDocs.py", line 215, in checkClassSummaries
 [exec]     missing.append((lastCaption, unEscapeURL(lastItem)))
 [exec]   File "/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/dev-tools/scripts/checkJavaDocs.py", line 303, in unEscapeURL
 [exec]     s = s.replace('%20', ' ')
 [exec] AttributeError: 'NoneType' object has no attribute 'replace'

BUILD FAILED
/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/build.xml:467: The following 
error occurred while executing this line:
/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/build.xml:63: The following 
error occurred while executing this line:
/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build.xml:212: The 
following error occurred while executing this line:
/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build.xml:247: The 
following error occurred while executing this line:
/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/common-build.xml:2338: 
exec returned: 1

Total time: 138 minutes 59 seconds
Build step 'Invoke Ant' marked build as failure
[description-setter] Description set: Java: 64bit/jdk1.8.0 
-XX:-UseCompressedOops -XX:+UseParallelGC
Archiving artifacts
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any




[jira] [Updated] (LUCENE-5714) Improve tests for BBoxStrategy then port to 4x.

[
https://issues.apache.org/jira/browse/LUCENE-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

David Smiley updated LUCENE-5714:
-

Attachment: LUCENE-5714_Enhance_BBoxStrategy.patch

Latest patch:
* BBoxSimilarity is gone; instead BBoxSimilarityValueSource is abstract (just
one impl though)
* Removed DistanceSimilarity as it's obsoleted by the generic
DistanceToShapeValueSource introduced a couple months ago
* AreaSimilarity renamed to BBoxOverlapRatioValueSource as it's a more
meaningful name
* BBoxOverlapRatioValueSource has a new minSideLength option that is applied to
sides of the query, target, and intersection boxes. It's an optional way to
handle point queries, which without this would basically match everything with
the same score since there is no intersection area.
* Added generic ShapeAreaValueSource (with geoArea boolean option) that
basically just calls shape.getArea(). This is a good way of handling sorting
the results of a point query for indexed rects.
* setPrecisionType is gone; instead I'm trying a new scheme in which you get
and set a FieldType. See LUCENE-5802. Use of DocValues is configurable and
enabled by default.
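
One possible reading of the minSideLength bullet, offered only as a sketch: each side of a box is widened to at least minSideLength before an area is computed, so a point or a line no longer ends up with zero area. MinSideArea and its method are hypothetical names for illustration, and the patch's actual arithmetic may differ.

```java
// Hypothetical helper; names are illustrative, not the BBoxOverlapRatioValueSource API.
final class MinSideArea {
    private MinSideArea() {}

    /** Area of a box after widening any side shorter than minSideLength. */
    static double area(double width, double height, double minSideLength) {
        double w = Math.max(width, minSideLength);
        double h = Math.max(height, minSideLength);
        return w * h;
    }
}
```

Under this reading, a point query (width and height both 0) with minSideLength 0.5 is treated as a 0.5 by 0.5 box with area 0.25, while a 2 by 3 box is unaffected.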

I think it's probably ready to be committed now.

Improve tests for BBoxStrategy then port to 4x.
---

Key: LUCENE-5714
URL: https://issues.apache.org/jira/browse/LUCENE-5714
Project: Lucene - Core
Issue Type: Improvement
Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
Fix For: 4.10

Attachments: LUCENE-5714_Enhance_BBoxStrategy.patch,
LUCENE-5714__Enhance_BBoxStrategy__more_tests,_fix_dateline_bugs,_new_AreaSimilarity_algor.patch

BBoxStrategy needs better tests before I'm comfortable seeing it in 4x.
Specifically it should use random rectangles based validation (ones that may
cross the dateline), akin to the other tests. And I think I see an
equals/hashcode bug to be fixed in there too.
One particular thing I'd like to see added is how to handle a zero-area case
for AreaSimilarity. I think an additional feature in which you declare a
minimum % area (relative to the query shape) would be good.
It should be possible for the user to combine rectangle center-point to query
shape center-point distance sorting as well. I think it is but I need to
make sure it's possible without _having_ to index a separate center point
field.
Another possibility (probably not to be addressed here) is a minimum ratio
between width/height, perhaps 10%. A long but nearly no height line should
not be massively disadvantaged relevancy-wise to an equivalently long
diagonal road that has a square bbox.


[jira] [Commented] (LUCENE-5779) Improve BBox AreaSimilarity algorithm to consider lines and points


[ 
https://issues.apache.org/jira/browse/LUCENE-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052163#comment-14052163
 ] 

David Smiley commented on LUCENE-5779:
--

LUCENE-5714's latest patch addresses this issue. It includes a new 
minSideLength option to this algorithm, plus a new ShapeAreaValueSource which 
is probably a better choice when your query is a point and you have indexed 
rects.

 Improve BBox AreaSimilarity algorithm to consider lines and points
 --

 Key: LUCENE-5779
 URL: https://issues.apache.org/jira/browse/LUCENE-5779
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/spatial
Reporter: David Smiley
 Attachments: LUCENE-5779__Improved_bbox_AreaSimilarity_algorithm.patch


 GeoPortal's area overlap algorithm didn't consider lines and points; they end 
 up turning the score 0.  I've thought about this for a bit and I've come up 
 with an alternative scoring algorithm.  (already coded and tested and 
 documented):
 New Javadocs:
 {code:java}
 /**
  * The algorithm is implemented as envelope on envelope overlays rather than
  * complex polygon on complex polygon overlays.
  * <p/>
  * <p/>
  * Spatial relevance scoring algorithm:
  * <DL>
  *   <DT>queryArea</DT> <DD>the area of the input query envelope</DD>
  *   <DT>targetArea</DT> <DD>the area of the target envelope (per Lucene document)</DD>
  *   <DT>intersectionArea</DT> <DD>the area of the intersection between the query and target envelopes</DD>
  *   <DT>queryTargetProportion</DT> <DD>A 0-1 factor that divides the score proportion between query and target.
  *   0.5 divides it evenly.</DD>
  *
  *   <DT>queryRatio</DT> <DD>intersectionArea / queryArea; (see note)</DD>
  *   <DT>targetRatio</DT> <DD>intersectionArea / targetArea; (see note)</DD>
  *   <DT>queryFactor</DT> <DD>queryRatio * queryTargetProportion;</DD>
  *   <DT>targetFactor</DT> <DD>targetRatio * (1 - queryTargetProportion);</DD>
  *   <DT>score</DT> <DD>queryFactor + targetFactor;</DD>
  * </DL>
  * Note: The actual computation of queryRatio and targetRatio is more complicated so that it considers
  * points and lines. Lines have the ratio of overlap, and points are either 1.0 or 0.0 depending on whether
  * it intersects or not.
  * <p />
  * Based on Geoportal's
  * <a href="http://geoportal.svn.sourceforge.net/svnroot/geoportal/Geoportal/trunk/src/com/esri/gpt/catalog/lucene/SpatialRankingValueSource.java">
  *   SpatialRankingValueSource</a> but modified. GeoPortal's algorithm will yield a score of 0
  * if either a line or point is compared, and it doesn't output a 0-1 normalized score (it multiplies the factors).
  *
  * @lucene.experimental
  */
 {code}
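
 As a worked example of the formula in the Javadoc above, here is a minimal stand-alone sketch. OverlapScore is a hypothetical name, not a class from the patch. It covers only the plain area case and deliberately leaves out the point and line handling mentioned in the note, so it rejects non-positive areas instead of guessing at that logic.

```java
// Illustrative only; mirrors the DL list in the Javadoc, without the point/line handling.
final class OverlapScore {
    private OverlapScore() {}

    static double score(double queryArea, double targetArea,
                        double intersectionArea, double queryTargetProportion) {
        if (queryArea <= 0 || targetArea <= 0) {
            // Zero-area shapes (points, lines) need the special handling this sketch omits.
            throw new IllegalArgumentException("areas must be positive in this sketch");
        }
        double queryRatio = intersectionArea / queryArea;
        double targetRatio = intersectionArea / targetArea;
        double queryFactor = queryRatio * queryTargetProportion;
        double targetFactor = targetRatio * (1 - queryTargetProportion);
        return queryFactor + targetFactor;
    }
}
```

 For queryArea = 100, targetArea = 50, intersectionArea = 25 and queryTargetProportion = 0.25, queryRatio is 0.25 and targetRatio is 0.5, so the score is 0.25 * 0.25 + 0.5 * 0.75 = 0.4375. A target identical to the query scores 1.0 for any proportion.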




[JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.8.0_20-ea-b15) - Build # 10598 - Failure!

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/10598/
Java: 64bit/jdk1.8.0_20-ea-b15 -XX:-UseCompressedOops -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 44902 lines...]
-documentation-lint:
 [echo] checking for broken html...
[jtidy] Checking for broken html (such as invalid tags)...
   [delete] Deleting directory 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/jtidy_tmp
 [echo] Checking for broken links...
 [exec] 
 [exec] Crawl/parse...
 [exec] 
 [exec] Verify...
 [echo] Checking for missing docs...
 [exec] Traceback (most recent call last):
 [exec]   File "/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/dev-tools/scripts/checkJavaDocs.py", line 371, in <module>
 [exec]     if checkPackageSummaries(sys.argv[1], level):
 [exec]   File "/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/dev-tools/scripts/checkJavaDocs.py", line 351, in checkPackageSummaries
 [exec]     if checkClassSummaries(fullPath):
 [exec]   File "/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/dev-tools/scripts/checkJavaDocs.py", line 215, in checkClassSummaries
 [exec]     missing.append((lastCaption, unEscapeURL(lastItem)))
 [exec]   File "/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/dev-tools/scripts/checkJavaDocs.py", line 303, in unEscapeURL
 [exec]     s = s.replace('%20', ' ')
 [exec] AttributeError: 'NoneType' object has no attribute 'replace'

BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:467: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:63: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build.xml:212: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build.xml:247: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/common-build.xml:2338: 
exec returned: 1

Total time: 76 minutes 5 seconds
Build step 'Invoke Ant' marked build as failure
[description-setter] Description set: Java: 64bit/jdk1.8.0_20-ea-b15 
-XX:-UseCompressedOops -XX:+UseParallelGC
Archiving artifacts
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any




[jira] [Updated] (SOLR-6183) Add spatial BBoxField using BBoxSpatialStrategy


 [ 
https://issues.apache.org/jira/browse/SOLR-6183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-6183:
---

Attachment: SOLR-6183__BBoxFieldType.patch

New patch:
* Syncs with the Lucene-spatial side in LUCENE-5714; in particular, docValues 
and whether to index or not are now supported.  Still limited to doubles though.
* New score=area & score=area2D options to return the area of the indexed 
shape. It's generally computed geodetically, but area2D uses simple & fast math 
(simply width * height) which is usually plenty good enough.
* score=overlapRatio is the new name for the former areaOverlap (or whatever I 
called it) algorithm.  And it has a new minSideLength local-param option.

 Add spatial BBoxField using BBoxSpatialStrategy
 ---

 Key: SOLR-6183
 URL: https://issues.apache.org/jira/browse/SOLR-6183
 Project: Solr
  Issue Type: New Feature
  Components: spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 4.10

 Attachments: SOLR-6183__BBoxFieldType.patch, 
 SOLR-6183__BBoxFieldType.patch


 This introduces a new BBoxField configured like so:
 {code:xml}
 <fieldType name="bbox" class="solr.BBoxField"
            numberType="tdouble" units="degrees"/>
 {code}
 It's a field type based on the same backing as the other Solr 4 spatial field 
 types (namely RPT) and thus it inherits the same way to use it, plus what is 
 unique to this field.  Ideally, the numberType would point to a double-based 
 field type configured with docValues=true but that is not required.  Only 
 TrieDouble is supported; no float yet.
 This strategy only accepts indexed rectangles and querying by a rectangle.  
 Indexing a rectangle requires WKT:
 {{ENVELOPE(-10, 20, 15, 10)}} which is minX, maxX, maxY, minY (yeah, that 'y' 
 order is wacky but it's not my spec).  This year I hope to add indexing 
 {{\['lat,lon' TO 'lat,lon']}} but it's not in there yet.
 To query using its special area overlap ranking, you have to use the special 
 'score' local-param with a new value like so:
 {{q=\{!field f=bbox score=areaOverlap 
 queryTargetProportion=0.25}Intersects(ENVELOPE(10,25,12,10))}}
 The queryTargetProportion defaults to 0.25 to be roughly what GeoPortal uses 
 (although GeoPortal actually has a different formula).  This default weights 
 1 part query factor to 3 parts target factor.
 Add debug=results to see useful explain info.




[jira] [Assigned] (SOLR-6223) SearchComponents may throw NPE when using shards.tolerant and there is a failure in the “GET_FIELDS/GET_HIGHLIGHTS/GET_DEBUG” phase

2014-07-03 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-6223:
---

Assignee: Shalin Shekhar Mangar

 SearchComponents may throw NPE when using shards.tolerant and there is a 
 failure in the “GET_FIELDS/GET_HIGHLIGHTS/GET_DEBUG” phase
 ---

 Key: SOLR-6223
 URL: https://issues.apache.org/jira/browse/SOLR-6223
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.9, 5.0
Reporter: Tomás Fernández Löbbe
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-6223.patch


 I found that, when using shards.tolerant, if there is some kind of exception 
 in the second phase of the search, some components throw an NPE. 
 I found it with the QueryComponent first, but then saw that other components 
 would suffer in the same way: DebugComponent, HighlightComponent and 
 MLTComponent. I only checked the components of the default chain.




[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #650: POMs out of sync