Re: [jira] [Commented] (PYLUCENE-27) JCC should be able to create sdist archives

2013-11-01 Thread Martin Scherer
So it is not documented, but according to this bug:
http://bugs.python.org/issue6884
it is prohibited to include files located under the 'build' directory.

So if one renames the output directory to something else via the --output
parameter, everything works as expected.
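For reference, the pruning behavior behind this can be sketched in plain Python (a simplified model of what distutils' sdist does with its build base, not JCC code; file names are made up for illustration):

```python
import re

def prune_build(manifest, build_base="build"):
    """Mimic distutils sdist: drop every path under the build base
    directory, which is 'build' by default."""
    under_build = re.compile(r"^" + re.escape(build_base) + r"/")
    return [path for path in manifest if not under_build.match(path)]

# Generated wrapper sources placed under 'build' silently vanish:
manifest = ["setup.py", "foo.egg-info/PKG-INFO", "build/_foo/foo.cpp"]
print(prune_build(manifest))                       # wrapper source is dropped
print(prune_build(manifest, build_base="output"))  # 'build' paths now survive
```

This is why pointing JCC's output somewhere other than 'build' makes the sources reappear in the archive.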

So feel free to close this report.

Nevertheless, I would recommend using an output directory that is not 'build'
by default, to avoid such situations in the future.

Best,
Martin

On 01.11.2013 15:48, Martin Scherer wrote:
 It is not clear to me why the source code is missing, because the
 Extension is defined properly (it couldn't be built if that were not the case).
 
 IMO it should be sufficient to provide the sdist keyword, which can
 already be passed via the '--extra-setup-arg' parameter.
 
 Does somebody have any hints about that?
 
 Unfortunately I'm no expert on distutils/setuptools.
 
 On 01.11.2013 00:10, Andi Vajda (JIRA) wrote:

 [ 
 https://issues.apache.org/jira/browse/PYLUCENE-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810843#comment-13810843
  ] 

 Andi Vajda commented on PYLUCENE-27:
 

 I have no idea how to do this or if this is even possible (I assume so).
 A patch implementing this would be more than welcome.

 JCC should be able to create sdist archives
 ---

 Key: PYLUCENE-27
 URL: https://issues.apache.org/jira/browse/PYLUCENE-27
 Project: PyLucene
  Issue Type: Wish
 Environment: jcc-svn-head
Reporter: Martin

 I was not able to create a complete (in terms that one is able to compile and 
 install the desired wrapper) source distribution.
 I've tried the following calls:
   python -m jcc --jar foo --egg-info --extra-setup-arg sdist
 and
  python -m jcc --jar foo --extra-setup-arg sdist
 Both create archives containing only the egg-info and setup.py, but no 
 source code at all.
 I really need this feature for my testing environment with tox, since this 
 heavily depends on the sdist feature.
 thanks,
 best,
 Martin



 --
 This message was sent by Atlassian JIRA
 (v6.1#6144)




[jira] [Commented] (LUCENE-5189) Numeric DocValues Updates

2013-11-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811069#comment-13811069
 ] 

ASF subversion and git services commented on LUCENE-5189:
-

Commit 1537832 from [~shaie] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1537832 ]

LUCENE-5189: add NumericDocValues field updates

 Numeric DocValues Updates
 -

 Key: LUCENE-5189
 URL: https://issues.apache.org/jira/browse/LUCENE-5189
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5189-4x.patch, LUCENE-5189-4x.patch, 
 LUCENE-5189-no-lost-updates.patch, LUCENE-5189-segdv.patch, 
 LUCENE-5189-updates-order.patch, LUCENE-5189-updates-order.patch, 
 LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, 
 LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, 
 LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, 
 LUCENE-5189_process_events.patch, LUCENE-5189_process_events.patch


 In LUCENE-4258 we started to work on incremental field updates; however, the 
 amount of changes is immense and hard to follow/consume. The reason is that 
 we targeted postings, stored fields, DV, etc., all from the get-go.
 I'd like to start afresh here, with numeric-dv-field updates only. There are 
 a couple of reasons for that:
 * NumericDV fields should be easier to update if, e.g., we write all the 
 values of all the documents in a segment for the updated field (similar to 
 how livedocs work, and previously norms).
 * It's a fairly contained issue, attempting to handle just one data type to 
 update, yet it requires many changes to core code which will also be useful 
 for updating other data types.
 * It has value in and of itself, and we don't need to allow updating all the 
 data types in Lucene at once ... we can do that gradually.
 I have a working patch already, which I'll upload next, explaining the 
 changes.
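The livedocs-style approach in the first bullet can be sketched as follows (an illustrative model in plain Python, not Lucene code: the updated field's value column is rewritten for the whole segment, replacing only the updated documents' values):

```python
def apply_field_update(segment_values, updates):
    """Rewrite the full per-segment value column for one numeric-DV field,
    substituting new values for the updated documents (similar to how a
    whole new livedocs bitset is written when documents are deleted)."""
    column = list(segment_values)  # copy: segment files are write-once
    for doc_id, new_value in updates.items():
        column[doc_id] = new_value
    return column

# docs 1 and 3 get a new value; the whole column is written out again
print(apply_field_update([10, 20, 30, 40], {1: 99, 3: 99}))
```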




-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5189) Numeric DocValues Updates

2013-11-01 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811070#comment-13811070
 ] 

Shai Erera commented on LUCENE-5189:


Backported to 4x. I'll handle the TODO (DVU_RENAME) tasks (rote renaming of 
internal API).

 Numeric DocValues Updates
 -

 Key: LUCENE-5189
 URL: https://issues.apache.org/jira/browse/LUCENE-5189
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5189-4x.patch, LUCENE-5189-4x.patch, 
 LUCENE-5189-no-lost-updates.patch, LUCENE-5189-segdv.patch, 
 LUCENE-5189-updates-order.patch, LUCENE-5189-updates-order.patch, 
 LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, 
 LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, 
 LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, 
 LUCENE-5189_process_events.patch, LUCENE-5189_process_events.patch


 In LUCENE-4258 we started to work on incremental field updates; however, the 
 amount of changes is immense and hard to follow/consume. The reason is that 
 we targeted postings, stored fields, DV, etc., all from the get-go.
 I'd like to start afresh here, with numeric-dv-field updates only. There are 
 a couple of reasons for that:
 * NumericDV fields should be easier to update if, e.g., we write all the 
 values of all the documents in a segment for the updated field (similar to 
 how livedocs work, and previously norms).
 * It's a fairly contained issue, attempting to handle just one data type to 
 update, yet it requires many changes to core code which will also be useful 
 for updating other data types.
 * It has value in and of itself, and we don't need to allow updating all the 
 data types in Lucene at once ... we can do that gradually.
 I have a working patch already, which I'll upload next, explaining the 
 changes.







[jira] [Updated] (LUCENE-5189) Numeric DocValues Updates

2013-11-01 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5189:
---

Attachment: LUCENE-5189-renames.patch

Handle TODO (DVU_RENAME) comments. It's really just a rote renaming, plus I 
renamed some members that referred to these objects.

 Numeric DocValues Updates
 -

 Key: LUCENE-5189
 URL: https://issues.apache.org/jira/browse/LUCENE-5189
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5189-4x.patch, LUCENE-5189-4x.patch, 
 LUCENE-5189-no-lost-updates.patch, LUCENE-5189-renames.patch, 
 LUCENE-5189-segdv.patch, LUCENE-5189-updates-order.patch, 
 LUCENE-5189-updates-order.patch, LUCENE-5189.patch, LUCENE-5189.patch, 
 LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, 
 LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, 
 LUCENE-5189.patch, LUCENE-5189_process_events.patch, 
 LUCENE-5189_process_events.patch


 In LUCENE-4258 we started to work on incremental field updates; however, the 
 amount of changes is immense and hard to follow/consume. The reason is that 
 we targeted postings, stored fields, DV, etc., all from the get-go.
 I'd like to start afresh here, with numeric-dv-field updates only. There are 
 a couple of reasons for that:
 * NumericDV fields should be easier to update if, e.g., we write all the 
 values of all the documents in a segment for the updated field (similar to 
 how livedocs work, and previously norms).
 * It's a fairly contained issue, attempting to handle just one data type to 
 update, yet it requires many changes to core code which will also be useful 
 for updating other data types.
 * It has value in and of itself, and we don't need to allow updating all the 
 data types in Lucene at once ... we can do that gradually.
 I have a working patch already, which I'll upload next, explaining the 
 changes.







[JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 65322 - Failure!

2013-11-01 Thread builder
Build: builds.flonkings.com/job/Lucene-trunk-Linux-Java7-64-test-only/65322/

All tests passed

Build Log:
[...truncated 710 lines...]
   [junit4] JVM J2: stdout was not empty, see: 
/var/lib/jenkins/workspace/Lucene-trunk-Linux-Java7-64-test-only/checkout/lucene/build/core/test/temp/junit4-J2-20131101_094418_004.sysout
   [junit4]  JVM J2: stdout (verbatim) 
   [junit4] #
   [junit4] # A fatal error has been detected by the Java Runtime Environment:
   [junit4] #
   [junit4] #  SIGSEGV (0xb) at pc=0x7f41d4991430, pid=682, 
tid=139920424404736
   [junit4] #
   [junit4] # JRE version: 7.0_05-b05
   [junit4] # Java VM: Java HotSpot(TM) 64-Bit Server VM (23.1-b03 mixed mode 
linux-amd64 compressed oops)
   [junit4] # Problematic frame:
   [junit4] # V  [libjvm.so+0x770430]  Parse::do_one_bytecode()+0x3290
   [junit4] #
   [junit4] # Failed to write core dump. Core dumps have been disabled. To 
enable core dumping, try ulimit -c unlimited before starting Java again
   [junit4] #
   [junit4] # An error report file with more information is saved as:
   [junit4] # 
/var/lib/jenkins/workspace/Lucene-trunk-Linux-Java7-64-test-only/checkout/lucene/build/core/test/J2/hs_err_pid682.log
   [junit4] #
   [junit4] # If you would like to submit a bug report, please visit:
   [junit4] #   http://bugreport.sun.com/bugreport/crash.jsp
   [junit4] #
   [junit4]  JVM J2: EOF 

[...truncated 656 lines...]
   [junit4] ERROR: JVM J2 ended with an exception, command line: 
/opt/java/64/jdk1.7.0_05/jre/bin/java -Dtests.prefix=tests 
-Dtests.seed=98A70CECE2B296F1 -Xmx512M -Dtests.iters= -Dtests.verbose=false 
-Dtests.infostream=false -Dtests.codec=random -Dtests.postingsformat=random 
-Dtests.docvaluesformat=random -Dtests.locale=random -Dtests.timezone=random 
-Dtests.directory=random -Dtests.linedocsfile=europarl.lines.txt.gz 
-Dtests.luceneMatchVersion=5.0 -Dtests.cleanthreads=perMethod 
-Djava.util.logging.config.file=/var/lib/jenkins/workspace/Lucene-trunk-Linux-Java7-64-test-only/checkout/lucene/tools/junit4/logging.properties
 -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true 
-Dtests.asserts.gracious=false -Dtests.multiplier=1 -DtempDir=. 
-Djava.io.tmpdir=. 
-Djunit4.tempDir=/var/lib/jenkins/workspace/Lucene-trunk-Linux-Java7-64-test-only/checkout/lucene/build/core/test/temp
 
-Dclover.db.dir=/var/lib/jenkins/workspace/Lucene-trunk-Linux-Java7-64-test-only/checkout/lucene/build/clover/db
 -Djava.security.manager=org.apache.lucene.util.TestSecurityManager 
-Djava.security.policy=/var/lib/jenkins/workspace/Lucene-trunk-Linux-Java7-64-test-only/checkout/lucene/tools/junit4/tests.policy
 -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 
-Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory 
-Djava.awt.headless=true -classpath 

[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.6.0_45) - Build # 8031 - Still Failing!

2013-11-01 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/8031/
Java: 32bit/jdk1.6.0_45 -client -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 32453 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:428: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:367: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:66: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:135: The 
following files are missing svn:eol-style (or binary svn:mime-type):
* 
./solr/core/src/java/org/apache/solr/update/processor/DocBasedVersionConstraintsProcessorFactory.java
* ./solr/core/src/test/org/apache/solr/cloud/TestDistribDocBasedVersion.java
* ./solr/core/src/test/org/apache/solr/search/TestStressUserVersions.java
* 
./solr/core/src/test/org/apache/solr/update/TestDocBasedVersionConstraints.java

Total time: 53 minutes 19 seconds
Build step 'Invoke Ant' marked build as failure
Description set: Java: 32bit/jdk1.6.0_45 -client -XX:+UseParallelGC
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Commented] (SOLR-2724) Deprecate defaultSearchField and defaultOperator defined in schema.xml

2013-11-01 Thread Arcadius Ahouansou (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811139#comment-13811139
 ] 

Arcadius Ahouansou commented on SOLR-2724:
--

Hello.

Looking at the schema.xml that ships with Solr 4.5.1, we still have:

{code:xml}
 <!-- field for the QueryParser to use when an explicit fieldname is absent -->
 <defaultSearchField>text</defaultSearchField>

 <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
 <solrQueryParser defaultOperator="OR"/>
{code}

Shouldn't those fields be removed?

I tried to remove them, but some of our tests are failing, meaning that some 
Solr components are still using them.
I don't yet know which components, though.


 Deprecate defaultSearchField and defaultOperator defined in schema.xml
 --

 Key: SOLR-2724
 URL: https://issues.apache.org/jira/browse/SOLR-2724
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis, search
Reporter: David Smiley
Assignee: David Smiley
Priority: Minor
 Fix For: 3.6, 4.0-ALPHA

 Attachments: 
 SOLR-2724_deprecateDefaultSearchField_and_defaultOperator.patch, 
 SOLR-2724_deprecateDefaultSearchField_and_defaultOperator.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 I've always been surprised to see the defaultSearchField element and 
 <solrQueryParser defaultOperator="OR"/> defined in the schema.xml file, ever 
 since the first time I saw them.  They just seem out of place to me, since 
 they are more query-parser related than schema related. But not only are they 
 misplaced, I feel they shouldn't exist. For query parsers, we already have a 
 df parameter that works just fine, and explicit field references. And the 
 default lucene query operator should stay at OR -- if a particular query 
 wants different behavior then use q.op or simply use OR.
 <similarity> seems like something better placed in solrconfig.xml than in the 
 schema. 
 In my opinion, the defaultSearchField and defaultOperator configuration 
 elements should be deprecated in Solr 3.x and removed in Solr 4.  And 
 <similarity> should move to solrconfig.xml. I am willing to do it, provided 
 there is consensus on it, of course.







[JENKINS] Lucene-Solr-Tests-4.x-Java6 - Build # 2126 - Still Failing

2013-11-01 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java6/2126/

All tests passed

Build Log:
[...truncated 32665 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/build.xml:428:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/build.xml:367:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/extra-targets.xml:66:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/extra-targets.xml:135:
 The following files are missing svn:eol-style (or binary svn:mime-type):
* 
./solr/core/src/java/org/apache/solr/update/processor/DocBasedVersionConstraintsProcessorFactory.java
* ./solr/core/src/test/org/apache/solr/cloud/TestDistribDocBasedVersion.java
* ./solr/core/src/test/org/apache/solr/search/TestStressUserVersions.java
* 
./solr/core/src/test/org/apache/solr/update/TestDocBasedVersionConstraints.java

Total time: 75 minutes 0 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Commented] (SOLR-2724) Deprecate defaultSearchField and defaultOperator defined in schema.xml

2013-11-01 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811242#comment-13811242
 ] 

David Smiley commented on SOLR-2724:


I just checked, but I'm not seeing it in 
./example/solr/collection1/conf/schema.xml, which is the main reference schema.  
The multicore example still has it.  I'm more of the opinion that ./multicore 
should simply be omitted altogether rather than improved.

I'm not surprised some tests depend on some schemas having these defaults.  
These schema elements were marked deprecated, which means they are discouraged 
but still supported.  They were written before SOLR-2724, most likely.  Nothing 
should fundamentally depend on these being specified in schema.xml; there are 
other, better places, like 'df' and 'q.op' in local-params. 
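For what it's worth, those same defaults can also be supplied outside the schema as request-handler defaults in solrconfig.xml; a sketch (the handler name and field name here are placeholders, not from this thread):

{code:xml}
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- default search field and operator as plain request parameters -->
    <str name="df">text</str>
    <str name="q.op">OR</str>
  </lst>
</requestHandler>
{code}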

 Deprecate defaultSearchField and defaultOperator defined in schema.xml
 --

 Key: SOLR-2724
 URL: https://issues.apache.org/jira/browse/SOLR-2724
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis, search
Reporter: David Smiley
Assignee: David Smiley
Priority: Minor
 Fix For: 3.6, 4.0-ALPHA

 Attachments: 
 SOLR-2724_deprecateDefaultSearchField_and_defaultOperator.patch, 
 SOLR-2724_deprecateDefaultSearchField_and_defaultOperator.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 I've always been surprised to see the defaultSearchField element and 
 <solrQueryParser defaultOperator="OR"/> defined in the schema.xml file, ever 
 since the first time I saw them.  They just seem out of place to me, since 
 they are more query-parser related than schema related. But not only are they 
 misplaced, I feel they shouldn't exist. For query parsers, we already have a 
 df parameter that works just fine, and explicit field references. And the 
 default lucene query operator should stay at OR -- if a particular query 
 wants different behavior then use q.op or simply use OR.
 <similarity> seems like something better placed in solrconfig.xml than in the 
 schema. 
 In my opinion, the defaultSearchField and defaultOperator configuration 
 elements should be deprecated in Solr 3.x and removed in Solr 4.  And 
 <similarity> should move to solrconfig.xml. I am willing to do it, provided 
 there is consensus on it, of course.







dangers of limiting tokenizers/disabling assertions in MockTokenizer?

2013-11-01 Thread Allison, Timothy B.
All,
  I realize that we should be consuming all tokens from a stream.  I'd like to 
wrap a client's Analyzer with LimitTokenCountAnalyzer using consume=false. For 
the analyzers that I've used, this has caused no problems.  When I use 
MockTokenizer, I run into this assertion error: "end() called before 
incrementToken()".  The comment in MockTokenizer reads:

// some tokenizers, such as limiting tokenizers, call end() before 
incrementToken() returns false.
// these tests should disable this check (in general you should consume the 
entire stream)

 Disabling assertions gives me pause, as does disobeying the workflow 
(http://lucene.apache.org/core/4_5_1/core/index.html).  I assume from the 
warnings that there are Analyzers and use cases that will fail unless the 
stream is entirely consumed.

  Is there a safe way to wrap a client Analyzer and only read x number of 
tokens?  Should I allow the client to decide whether or not to consume?

  Thank you!

 Best,

  Tim



[jira] [Commented] (SOLR-5392) extend solrj apis to cover collection management

2013-11-01 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811268#comment-13811268
 ] 

Mark Miller commented on SOLR-5392:
---

Jenkins is showing some more random fails when there are 2 config sets, so I've 
changed a couple of the create-collection methods to also take the config set 
name.
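For context, the Collections API call that these solrj methods wrap takes the config set via the collection.configName parameter; building such a request by hand can be sketched as follows (the base URL and names are placeholder values):

```python
from urllib.parse import urlencode

def create_collection_url(base_url, name, num_shards, config_name):
    """Build a Collections API CREATE request; collection.configName
    selects which config set the new collection uses."""
    params = urlencode({
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "collection.configName": config_name,
    })
    return f"{base_url}/admin/collections?{params}"

print(create_collection_url("http://localhost:8983/solr", "c1", 2, "conf1"))
```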

 extend solrj apis to cover collection management
 

 Key: SOLR-5392
 URL: https://issues.apache.org/jira/browse/SOLR-5392
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Affects Versions: 4.5
Reporter: Roman Shaposhnik
Assignee: Mark Miller
 Attachments: 
 0001-SOLR-5392.-extend-solrj-apis-to-cover-collection-man.patch, 
 SOLR-5392.patch


 It would be useful to extend solrj APIs to cover collection management calls: 
 https://cwiki.apache.org/confluence/display/solr/Collections+API 







[jira] [Commented] (LUCENE-5296) Add DirectDocValuesFormat

2013-11-01 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811269#comment-13811269
 ] 

Shai Erera commented on LUCENE-5296:


Now that we have this codec, does it make sense to keep FacetDVF? As far as I 
can tell, the only difference is that FacetDVF keeps the addresses as 
PackedInts while DirectDVF keeps them as int[]?

 Add DirectDocValuesFormat
 -

 Key: LUCENE-5296
 URL: https://issues.apache.org/jira/browse/LUCENE-5296
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.6, 5.0

 Attachments: LUCENE-5296.patch


 Indexes values to disk but at search time it loads/accesses the values via 
 simple java arrays (i.e. no compression).







[jira] [Commented] (SOLR-5392) extend solrj apis to cover collection management

2013-11-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811271#comment-13811271
 ] 

ASF subversion and git services commented on SOLR-5392:
---

Commit 1537941 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1537941 ]

SOLR-5392: Add conf set name to solrj collection create methods

 extend solrj apis to cover collection management
 

 Key: SOLR-5392
 URL: https://issues.apache.org/jira/browse/SOLR-5392
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Affects Versions: 4.5
Reporter: Roman Shaposhnik
Assignee: Mark Miller
 Attachments: 
 0001-SOLR-5392.-extend-solrj-apis-to-cover-collection-man.patch, 
 SOLR-5392.patch


 It would be useful to extend solrj APIs to cover collection management calls: 
 https://cwiki.apache.org/confluence/display/solr/Collections+API 







[jira] [Commented] (SOLR-5392) extend solrj apis to cover collection management

2013-11-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811277#comment-13811277
 ] 

ASF subversion and git services commented on SOLR-5392:
---

Commit 1537943 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1537943 ]

SOLR-5392: Add conf set name to solrj collection create methods

 extend solrj apis to cover collection management
 

 Key: SOLR-5392
 URL: https://issues.apache.org/jira/browse/SOLR-5392
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Affects Versions: 4.5
Reporter: Roman Shaposhnik
Assignee: Mark Miller
 Attachments: 
 0001-SOLR-5392.-extend-solrj-apis-to-cover-collection-man.patch, 
 SOLR-5392.patch


 It would be useful to extend solrj APIs to cover collection management calls: 
 https://cwiki.apache.org/confluence/display/solr/Collections+API 







[jira] [Resolved] (SOLR-5392) extend solrj apis to cover collection management

2013-11-01 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-5392.
---

   Resolution: Fixed
Fix Version/s: 5.0
   4.6

Thanks Roman!

 extend solrj apis to cover collection management
 

 Key: SOLR-5392
 URL: https://issues.apache.org/jira/browse/SOLR-5392
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Affects Versions: 4.5
Reporter: Roman Shaposhnik
Assignee: Mark Miller
 Fix For: 4.6, 5.0

 Attachments: 
 0001-SOLR-5392.-extend-solrj-apis-to-cover-collection-man.patch, 
 SOLR-5392.patch


 It would be useful to extend solrj APIs to cover collection management calls: 
 https://cwiki.apache.org/confluence/display/solr/Collections+API 







Re: [JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 65322 - Failure!

2013-11-01 Thread Robert Muir
On Fri, Nov 1, 2013 at 4:48 AM,  buil...@flonkings.com wrote:

[junit4]  JVM J2: stdout (verbatim) 
[junit4] #
[junit4] # A fatal error has been detected by the Java Runtime Environment:
[junit4] #
[junit4] #  SIGSEGV (0xb) at pc=0x7f41d4991430, pid=682, 
 tid=139920424404736
[junit4] #
[junit4] # JRE version: 7.0_05-b05
[junit4] # Java VM: Java HotSpot(TM) 64-Bit Server VM (23.1-b03 mixed mode 
 linux-amd64 compressed oops)
[junit4] # Problematic frame:
[junit4] # V  [libjvm.so+0x770430]  Parse::do_one_bytecode()+0x3290
[junit4] #
[junit4] # Failed to write core dump. Core dumps have been disabled. To 
 enable core dumping, try ulimit -c unlimited before starting Java again
[junit4] #
[junit4] # An error report file with more information is saved as:
[junit4] # 
 /var/lib/jenkins/workspace/Lucene-trunk-Linux-Java7-64-test-only/checkout/lucene/build/core/test/J2/hs_err_pid682.log
[junit4] #
[junit4] # If you would like to submit a bug report, please visit:
[junit4] #   http://bugreport.sun.com/bugreport/crash.jsp
[junit4] #
[junit4]  JVM J2: EOF 


this one is new




Re: dangers of limiting tokenizers/disabling assertions in MockTokenizer?

2013-11-01 Thread Robert Muir
On Fri, Nov 1, 2013 at 9:30 AM, Allison, Timothy B. talli...@mitre.org wrote:

  Disabling assertions gives me pause as does disobeying the workflow
 (http://lucene.apache.org/core/4_5_1/core/index.html).  I assume from the
 warnings that there are Analyzers and use cases that will fail unless the
 stream is entirely consumed.

The option has to be there: if this check were disabled by default,
it would allow too much leniency overall, and lots of other useful
checks wouldn't work either.

Users also already have a 'consumeAllTokens' option on the limiter if
their analyzer has bugs here.
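The workflow contract under discussion can be modeled in a few lines of plain Python (a toy sketch, not Lucene code): a limiting filter that stops early calls end() while the underlying stream still has tokens, which is exactly the condition MockTokenizer asserts on unless the check is disabled.

```python
class ModelTokenStream:
    """Toy model of the TokenStream contract:
    call incrementToken() until it returns False, then end()."""
    def __init__(self, tokens):
        self._it = iter(tokens)
        self.exhausted = False
        self.current = None

    def increment_token(self):
        try:
            self.current = next(self._it)
            return True
        except StopIteration:
            self.exhausted = True
            return False

    def end(self):
        # MockTokenizer-style check: end() before exhaustion is a violation
        if not self.exhausted:
            raise AssertionError(
                "end() called before incrementToken() returned False")

def limit_tokens(stream, max_tokens, consume_all=True):
    kept = []
    while stream.increment_token():
        if len(kept) < max_tokens:
            kept.append(stream.current)
        elif not consume_all:
            break  # stop early: end() below will trip the workflow check
    stream.end()
    return kept

print(limit_tokens(ModelTokenStream(["a", "b", "c"]), 2))  # consumes all: ok
```

With consume_all=False the same call raises the "end() called before incrementToken()" error, which mirrors the choice between limiting cheaply and obeying the workflow.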




Re: [jira] [Commented] (PYLUCENE-27) JCC should be able to create sdist archives

2013-11-01 Thread Martin Scherer
It is not clear to me why the source code is missing, because the
Extension is defined properly (it couldn't be built if that were not the case).

IMO it should be sufficient to provide the sdist keyword, which can
already be passed via the '--extra-setup-arg' parameter.

Does somebody have any hints about that?

Unfortunately I'm no expert on distutils/setuptools.

On 01.11.2013 00:10, Andi Vajda (JIRA) wrote:
 
 [ 
 https://issues.apache.org/jira/browse/PYLUCENE-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810843#comment-13810843
  ] 
 
 Andi Vajda commented on PYLUCENE-27:
 
 
 I have no idea how to do this or if this is even possible (I assume so).
 A patch implementing this would be more than welcome.
 
 JCC should be able to create sdist archives
 ---

 Key: PYLUCENE-27
 URL: https://issues.apache.org/jira/browse/PYLUCENE-27
 Project: PyLucene
  Issue Type: Wish
 Environment: jcc-svn-head
Reporter: Martin

 I was not able to create a complete (in terms that one is able to compile and 
 install the desired wrapper) source distribution.
 I've tried the following calls:
   python -m jcc --jar foo --egg-info --extra-setup-arg sdist
 and
  python -m jcc --jar foo --extra-setup-arg sdist
 Both create archives containing only the egg-info and setup.py, but no source 
 code at all.
 I really need this feature for my testing environment with tox, since this 
 heavily depends on the sdist feature.
 thanks,
 best,
 Martin
 
 
 
 



[jira] [Commented] (LUCENE-5283) Fail the build if ant test didn't execute any tests (everything filtered out).

2013-11-01 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811308#comment-13811308
 ] 

Michael McCandless commented on LUCENE-5283:


Wow, +1 to commit :)

I wish we used Python as our build tool!

 Fail the build if ant test didn't execute any tests (everything filtered out).
 --

 Key: LUCENE-5283
 URL: https://issues.apache.org/jira/browse/LUCENE-5283
 Project: Lucene - Core
  Issue Type: Wish
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
 Fix For: 4.6, 5.0

 Attachments: LUCENE-5283.patch, LUCENE-5283.patch


 This should be an optional setting that defaults to 'false' (the build 
 proceeds).







[jira] [Commented] (LUCENE-5296) Add DirectDocValuesFormat

2013-11-01 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811310#comment-13811310
 ] 

Michael McCandless commented on LUCENE-5296:


bq. Now that we have this codec, does it make sense to keep FacetDVF? As far as 
I can tell, the only difference is that FacetDVF keeps the addresses as 
PackedInts while DirectDVF keeps them as int[]?

Hmm that's a good question.  I'll test the two...

 Add DirectDocValuesFormat
 -

 Key: LUCENE-5296
 URL: https://issues.apache.org/jira/browse/LUCENE-5296
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.6, 5.0

 Attachments: LUCENE-5296.patch


 Indexes values to disk but at search time it loads/accesses the values via 
 simple java arrays (i.e. no compression).






Re: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1015: POMs out of sync

2013-11-01 Thread Steve Rowe
The problem here is that the copied/filtered solr-core POM is missing a lot of 
dependencies: log4j, noggit, commons-io, httpcomponents.  These are added to 
the solr-core “classpath” in the Ant build from solrj and example lib/ 
directories.  When these are collected from the Ant build in preparation for 
filtering the POMs, only files that actually exist make it into the 
“classpath”, and apparently at the point the solr-core “classpath” is examined, 
there is nothing in the solrj/lib/ and example/lib/ directories.

I can reproduce this locally, if I first rm -rf $(find . -name '*.jar'), like 
the Jenkins build does.  Looks like when “generate-maven-artifacts” calls 
“filter-pom-templates”, it doesn’t invoke “resolve” (unlike “get-maven-poms”, 
which does).  Fix should be simple - I’ll work on it later today.

Steve

On Oct 31, 2013, at 10:50 PM, Apache Jenkins Server jenk...@builds.apache.org 
wrote:

 Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1015/
 
 All tests passed
 
 Build Log:
 [...truncated 36479 lines...]
  [mvn] [INFO] 
 -
  [mvn] [INFO] 
 -
  [mvn] [ERROR] COMPILATION ERROR : 
  [mvn] [INFO] 
 -
 
 [...truncated 593 lines...]
 
 
 





[jira] [Created] (LUCENE-5320) Create SearcherTaxonomyManager over Directory

2013-11-01 Thread Shai Erera (JIRA)
Shai Erera created LUCENE-5320:
--

 Summary: Create SearcherTaxonomyManager over Directory
 Key: LUCENE-5320
 URL: https://issues.apache.org/jira/browse/LUCENE-5320
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/facet
Reporter: Shai Erera


SearcherTaxonomyManager currently only allows working in NRT mode. It could be 
useful to have an STM which allows reopening a SearcherAndTaxonomy pair over 
Directories, e.g. for replication. The problem is that if the thread that calls 
maybeRefresh() is not the one that does the commit(), it could lead to a pair 
that is not synchronized.

Perhaps at first we could have a simple version that works under some 
assumptions, i.e. that the app does the commit + reopen in the same thread, in 
that order, so that it can be used by such apps and when replicating the 
indexes; later we can figure out how to generalize it to work even if commit + 
reopen are done by separate threads/JVMs.

I'll see whether SearcherTaxonomyManager can be extended to support this, or 
whether a new STM is required.
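The single-thread assumption above can be sketched with a tiny stand-in simulation (all class and method names below are invented for illustration; none of this is Lucene API): as long as the thread that commits both indexes is also the one that reopens, every snapshot it takes pairs matching generations.

```java
import java.util.concurrent.atomic.AtomicLong;

public class PairedRefreshSketch {
    // Stand-ins for the committed generation of the search index and the
    // taxonomy index (invented for illustration; not Lucene API).
    static final AtomicLong indexGen = new AtomicLong();
    static final AtomicLong taxoGen = new AtomicLong();

    // Snapshot of a SearcherAndTaxonomy-like pair.
    record Pair(long index, long taxo) {}

    static void commitBoth() {
        // Taxonomy is committed before the index commit that refers to its
        // ordinals, mirroring the usual facet commit order.
        taxoGen.incrementAndGet();
        indexGen.incrementAndGet();
    }

    static Pair refresh() {
        return new Pair(indexGen.get(), taxoGen.get());
    }

    public static void main(String[] args) {
        // Same thread does commit + reopen, in that order: the observed
        // pair is always consistent.
        for (int i = 0; i < 3; i++) {
            commitBoth();
            Pair p = refresh();
            if (p.index() != p.taxo()) throw new AssertionError("unsynchronized pair");
        }
        System.out.println("consistent");
    }
}
```

If commitBoth() and refresh() ran on separate threads with no ordering, refresh() could observe the taxonomy commit without the matching index commit, which is exactly the hazard described above.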






[jira] [Commented] (SOLR-5311) Avoid registering replicas which are removed

2013-11-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811408#comment-13811408
 ] 

ASF subversion and git services commented on SOLR-5311:
---

Commit 1537978 from [~noble.paul] in branch 'dev/trunk'
[ https://svn.apache.org/r1537978 ]

SOLR-5311, invalid error message

 Avoid registering replicas which are removed 
 -

 Key: SOLR-5311
 URL: https://issues.apache.org/jira/browse/SOLR-5311
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 4.6, 5.0

 Attachments: SOLR-5311.patch, SOLR-5311.patch, SOLR-5311.patch, 
 SOLR-5311.patch, SOLR-5311.patch, SOLR-5311.patch, SOLR-5311.patch


 If a replica is removed from the clusterstate and it comes back up, it 
 should not be allowed to register. 
 Each core, when it comes up, checks whether it was already registered and, 
 if so, whether it is still there. If not, it throws an error and 
 unregisters. If such a request comes to the overseer, it should ignore it.






[jira] [Commented] (LUCENE-5283) Fail the build if ant test didn't execute any tests (everything filtered out).

2013-11-01 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811416#comment-13811416
 ] 

Dawid Weiss commented on LUCENE-5283:
-

I think you would like gradle -- it's essentially a fully scripted build system 
(in Groovy) but very well thought out.

 Fail the build if ant test didn't execute any tests (everything filtered out).
 --

 Key: LUCENE-5283
 URL: https://issues.apache.org/jira/browse/LUCENE-5283
 Project: Lucene - Core
  Issue Type: Wish
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
 Fix For: 4.6, 5.0

 Attachments: LUCENE-5283.patch, LUCENE-5283.patch


 This should be an optional setting that defaults to 'false' (the build 
 proceeds).






[jira] [Commented] (SOLR-5354) Distributed sort is broken with CUSTOM FieldType

2013-11-01 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811422#comment-13811422
 ] 

Steve Rowe commented on SOLR-5354:
--

Thanks for the review Robert.

bq. Can we please not have the Object/Object stuff in FieldComparatorSource? 
This is wrong: FieldComparator already has a generic type so I don't understand 
the need to discard type safety.

I'm not sure what you have in mind - do you think FieldComparatorSource should 
be generified? In this case I think each extending class will need to provide 
an implementation for these methods, since there isn't a sensible way to 
provide a default implementation of conversion to/from the generic type.

bq. The unicode conversion for String/String_VAL is incorrect and should not 
exist: despite the name, these types can be any bytes

This is the status quo right now - the patch just keeps that in place.  But I 
agree.  I think the issue is non-binary (XML) serialization, for which UTF-8 is 
safe, but arbitrary binary is not.  Serializing all STRING/STRING_VAL as Base64 
seems wasteful in the general case.

Relatedly, looks like there's an orphaned {{SortField.Type.BYTES}} (orphaned in 
that it's not handled in lots of places) - I guess this should go away?

{quote}
As a concrete example the CollationField and ICUCollationField sort with 
String/String_VAL comparators but contain non-unicode bytes.

These currently do not work distributed today either (which I would love to see 
fixed on this issue).
{quote}

I'm working on a distributed version of the Solr (icu) collation tests.  Once I 
get that failing, I'll be able to test potential solutions.
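The UTF-8 hazard under discussion is easy to demonstrate with plain JDK calls (a sketch, not Solr code): decoding arbitrary sort-key bytes as UTF-8 silently replaces invalid sequences with U+FFFD, so the conversion is lossy and the original bytes cannot be recovered.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8RoundTripSketch {
    public static void main(String[] args) {
        // Arbitrary sort-key bytes, not valid UTF-8: 0xFF never appears in
        // a well-formed UTF-8 sequence, and 0xC0 here starts a truncated one.
        byte[] raw = {(byte) 0xFF, 0x41, (byte) 0xC0};

        // What a UTF8->UTF16 conversion effectively does: invalid sequences
        // become U+FFFD replacement characters.
        String asUtf16 = new String(raw, StandardCharsets.UTF_8);

        // Re-encoding does not restore the original bytes.
        byte[] back = asUtf16.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(raw, back)); // prints false
    }
}
```

This is why Base64 (or some other binary-safe encoding) is needed for non-binary serialization of arbitrary bytes, and why it feels wasteful to impose it on fields that really are UTF-8 strings.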

 Distributed sort is broken with CUSTOM FieldType
 

 Key: SOLR-5354
 URL: https://issues.apache.org/jira/browse/SOLR-5354
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other
Affects Versions: 4.4, 4.5, 5.0
Reporter: Jessica Cheng
Assignee: Steve Rowe
  Labels: custom, query, sort
 Attachments: SOLR-5354.patch


 We added a custom field type to allow an indexed binary field type that 
 supports search (exact match), prefix search, and sort as unsigned-byte 
 lexicographic compare. For sort, BytesRef's UTF8SortedAsUnicodeComparator 
 accomplishes what we want, and even though the name of the comparator 
 mentions UTF8, it doesn't actually assume so and just does byte-level 
 operations, so it's good. However, when we do this across different nodes, we 
 run into an issue in QueryComponent.doFieldSortValues:
   // Must do the same conversion when sorting by a
   // String field in Lucene, which returns the terms
   // data as BytesRef:
   if (val instanceof BytesRef) {
     UnicodeUtil.UTF8toUTF16((BytesRef) val, spare);
     field.setStringValue(spare.toString());
     val = ft.toObject(field);
   }
 UnicodeUtil.UTF8toUTF16 is called on our byte array, which isn't actually 
 UTF8. I did a hack where I specified our own field comparator to be 
 ByteBuffer based to get around that instanceof check, but then the field 
 value gets transformed into BYTEARR in JavaBinCodec, and when it's 
 unmarshalled, it gets turned into byte[]. Then, in QueryComponent.mergeIds, a 
 ShardFieldSortedHitQueue is constructed with ShardDoc.getCachedComparator, 
 which decides to give me comparatorNatural in the else of the TODO for 
 CUSTOM, which barfs because byte[] is not Comparable...
 From Chris Hostetter:
 I'm not very familiar with the distributed sorting code, but based on your
 comments, and a quick skim of the functions you pointed to, it definitely
 seems like there are two problems here for people trying to implement
 custom sorting in custom FieldTypes...
 1) QueryComponent.doFieldSortValues - this definitely seems like it should
 be based on the FieldType, not an instanceof BytesRef check (oddly: the
 comment even suggests that it should be using the FieldType's
 indexedToReadable() method -- but it doesn't do that).  If it did, then
 this part of the logic should work for you as long as your custom
 FieldType implemented indexedToReadable in a sane way.
 2) QueryComponent.mergeIds - that TODO definitely looks like a gap that
 needs to be filled.  I'm guessing the sanest thing to do in the CUSTOM case
 would be to ask the FieldComparatorSource (which should be coming from the
 SortField that the custom FieldType produced) to create a FieldComparator
 (via newComparator - the numHits & sortPos could be anything) and then
 wrap that up in a Comparator facade that delegates to
 FieldComparator.compareValues
 That way a custom FieldType could be in complete control of the sort
 comparisons (even when merging ids).
 ...But as i said: i may be missing something, i'm not super 
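The Comparator facade Chris suggests can be illustrated with a self-contained stand-in (the FieldComparatorStandIn interface below is invented for illustration; Lucene's real FieldComparator has more methods, and compareValues is the one that matters here):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ComparatorFacadeSketch {
    // Minimal stand-in for Lucene's FieldComparator<T>; only the method the
    // facade delegates to is modeled.
    interface FieldComparatorStandIn<T> {
        int compareValues(T first, T second);
    }

    // The facade: merging code can sort shard values with a plain
    // java.util.Comparator while the custom FieldType's comparator stays in
    // control of the ordering.
    static <T> Comparator<T> asComparator(FieldComparatorStandIn<T> fc) {
        return fc::compareValues;
    }

    public static void main(String[] args) {
        // Example ordering from the issue: unsigned-byte lexicographic compare.
        FieldComparatorStandIn<byte[]> unsignedBytes = (a, b) -> {
            int n = Math.min(a.length, b.length);
            for (int i = 0; i < n; i++) {
                int cmp = (a[i] & 0xFF) - (b[i] & 0xFF);
                if (cmp != 0) return cmp;
            }
            return a.length - b.length;
        };
        List<byte[]> vals = new ArrayList<>();
        vals.add(new byte[] {(byte) 0x80}); // 128 unsigned
        vals.add(new byte[] {0x01});        // 1
        vals.sort(asComparator(unsignedBytes));
        System.out.println((vals.get(0)[0] & 0xFF) + " " + (vals.get(1)[0] & 0xFF));
    }
}
```

With this approach byte[] values never need to be Comparable themselves; the merge queue just delegates every comparison to the comparator the custom FieldType produced.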

[jira] [Commented] (SOLR-5311) Avoid registering replicas which are removed

2013-11-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811420#comment-13811420
 ] 

ASF subversion and git services commented on SOLR-5311:
---

Commit 1537981 from [~noble.paul] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1537981 ]

SOLR-5311, invalid error message

 Avoid registering replicas which are removed 
 -

 Key: SOLR-5311
 URL: https://issues.apache.org/jira/browse/SOLR-5311
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 4.6, 5.0

 Attachments: SOLR-5311.patch, SOLR-5311.patch, SOLR-5311.patch, 
 SOLR-5311.patch, SOLR-5311.patch, SOLR-5311.patch, SOLR-5311.patch


 If a replica is removed from the clusterstate and it comes back up, it 
 should not be allowed to register. 
 Each core, when it comes up, checks whether it was already registered and, 
 if so, whether it is still there. If not, it throws an error and 
 unregisters. If such a request comes to the overseer, it should ignore it.






[jira] [Commented] (SOLR-5354) Distributed sort is broken with CUSTOM FieldType

2013-11-01 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811432#comment-13811432
 ] 

Robert Muir commented on SOLR-5354:
---

{quote}
This is the status quo right now - the patch just keeps that in place. 
{quote}

No it's not: it's a bug in Solr. This patch moves that bug into Lucene. 

Lucene's APIs here work correctly on any bytes today.

 Distributed sort is broken with CUSTOM FieldType
 

 Key: SOLR-5354
 URL: https://issues.apache.org/jira/browse/SOLR-5354
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other
Affects Versions: 4.4, 4.5, 5.0
Reporter: Jessica Cheng
Assignee: Steve Rowe
  Labels: custom, query, sort
 Attachments: SOLR-5354.patch








[jira] [Commented] (SOLR-5354) Distributed sort is broken with CUSTOM FieldType

2013-11-01 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811433#comment-13811433
 ] 

Robert Muir commented on SOLR-5354:
---

{quote}
I think the issue is non-binary (XML) serialization, for which UTF-8 is safe, 
but arbitrary binary is not. Serializing all STRING/STRING_VAL as Base64 seems 
wasteful in the general case.
{quote}

This is all solr stuff. I don't think it makes sense to move that logic into 
lucene, let the user deal with this. They might not be using XML at all: maybe 
thrift or avro or something else.

Why not just add serialize/deserialize methods to solr's FieldType.java? It 
seems like the obvious place. 
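A rough sketch of what such hooks could look like (FieldTypeStandIn and the marshalSortValue/unmarshalSortValue names are invented for illustration, not actual Solr API): a binary-sorting field type opts into Base64 for transport, while ordinary string types could pass values through unchanged.

```java
import java.util.Arrays;
import java.util.Base64;

public class FieldTypeMarshalSketch {
    // Hypothetical hooks of the kind being suggested; not Solr's real
    // FieldType class.
    static abstract class FieldTypeStandIn {
        abstract Object marshalSortValue(Object value);
        abstract Object unmarshalSortValue(Object value);
    }

    // A binary field type keeps arbitrary sort-key bytes transport-safe via
    // Base64, without forcing Base64 on every STRING field.
    static class BinarySortFieldType extends FieldTypeStandIn {
        Object marshalSortValue(Object value) {
            return Base64.getEncoder().encodeToString((byte[]) value);
        }
        Object unmarshalSortValue(Object value) {
            return Base64.getDecoder().decode((String) value);
        }
    }

    public static void main(String[] args) {
        FieldTypeStandIn ft = new BinarySortFieldType();
        byte[] key = {(byte) 0xFF, 0x00, 0x10};   // not valid UTF-8
        Object wire = ft.marshalSortValue(key);   // safe for XML or javabin
        byte[] back = (byte[]) ft.unmarshalSortValue(wire);
        System.out.println(Arrays.equals(key, back));
    }
}
```

Each field type stays in control of its own wire representation, which keeps the encoding decision out of Lucene entirely.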

 Distributed sort is broken with CUSTOM FieldType
 

 Key: SOLR-5354
 URL: https://issues.apache.org/jira/browse/SOLR-5354
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other
Affects Versions: 4.4, 4.5, 5.0
Reporter: Jessica Cheng
Assignee: Steve Rowe
  Labels: custom, query, sort
 Attachments: SOLR-5354.patch








[jira] [Commented] (SOLR-5354) Distributed sort is broken with CUSTOM FieldType

2013-11-01 Thread Jessica Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811450#comment-13811450
 ] 

Jessica Cheng commented on SOLR-5354:
-

{quote}
Why not just add serialize/deserialize methods to solr's FieldType.java? It 
seems like the obvious place.
{quote}

When SortFields are deserialized on the receiving end, it's no longer clear 
which FieldType the field came from. If the deserialization method depends on 
FieldType, the node responsible for the merge must also have the schema loaded, 
which might not be the case in SolrCloud. Maybe solr needs its own SortField 
too then?

 Distributed sort is broken with CUSTOM FieldType
 

 Key: SOLR-5354
 URL: https://issues.apache.org/jira/browse/SOLR-5354
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other
Affects Versions: 4.4, 4.5, 5.0
Reporter: Jessica Cheng
Assignee: Steve Rowe
  Labels: custom, query, sort
 Attachments: SOLR-5354.patch








[jira] [Commented] (SOLR-5354) Distributed sort is broken with CUSTOM FieldType

2013-11-01 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811461#comment-13811461
 ] 

Robert Muir commented on SOLR-5354:
---

{quote}
When SortFields are deserialized on the receiving end, it's no longer clear 
which FieldType the field came from. If the deserialization method depends on 
FieldType, the node responsible for the merge must also have the schema loaded, 
which might not be the case in SolrCloud.
{quote}

Then where is it getting a comparator from? I don't understand how changing a 
lucene API solves this problem.

 Distributed sort is broken with CUSTOM FieldType
 

 Key: SOLR-5354
 URL: https://issues.apache.org/jira/browse/SOLR-5354
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other
Affects Versions: 4.4, 4.5, 5.0
Reporter: Jessica Cheng
Assignee: Steve Rowe
  Labels: custom, query, sort
 Attachments: SOLR-5354.patch








[jira] [Commented] (SOLR-5354) Distributed sort is broken with CUSTOM FieldType

2013-11-01 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811463#comment-13811463
 ] 

Robert Muir commented on SOLR-5354:
---

{quote}
Maybe solr needs its own SortField too then?
{quote}

OK, I see it. I think Solr should fix its own APIs here? It could add 
FieldType[] to SortSpec or something like that.

 Distributed sort is broken with CUSTOM FieldType
 

 Key: SOLR-5354
 URL: https://issues.apache.org/jira/browse/SOLR-5354
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other
Affects Versions: 4.4, 4.5, 5.0
Reporter: Jessica Cheng
Assignee: Steve Rowe
  Labels: custom, query, sort
 Attachments: SOLR-5354.patch


 We added a custom field type to allow an indexed binary field type that 
 supports search (exact match), prefix search, and sort as unsigned bytes 
 lexicographical compare. For sort, BytesRef's UTF8SortedAsUnicodeComparator 
 accomplishes what we want, and even though the name of the comparator 
 mentions UTF8, it doesn't actually assume so and just does byte-level 
 operation, so it's good. However, when we do this across different nodes, we 
 run into an issue where in QueryComponent.doFieldSortValues:
   // Must do the same conversion when sorting by a
   // String field in Lucene, which returns the terms
   // data as BytesRef:
   if (val instanceof BytesRef) {
 UnicodeUtil.UTF8toUTF16((BytesRef)val, spare);
 field.setStringValue(spare.toString());
 val = ft.toObject(field);
   }
 UnicodeUtil.UTF8toUTF16 is called on our byte array,which isn't actually 
 UTF8. I did a hack where I specified our own field comparator to be 
 ByteBuffer based to get around that instanceof check, but then the field 
 value gets transformed into BYTEARR in JavaBinCodec, and when it's 
 unmarshalled, it gets turned into byte[]. Then, in QueryComponent.mergeIds, a 
 ShardFieldSortedHitQueue is constructed with ShardDoc.getCachedComparator, 
 which decides to give me comparatorNatural in the else of the TODO for 
 CUSTOM, which barfs because byte[] are not Comparable...
 From Chris Hostetter:
 I'm not very familiar with the distributed sorting code, but based on your
 comments, and a quick skim of the functions you pointed to, it definitely
 seems like there are two problems here for people trying to implement
 custom sorting in custom FieldTypes...
 1) QueryComponent.doFieldSortValues - this definitely seems like it should
 be based on the FieldType, not an instanceof BytesRef check (oddly: the
 comment event suggestsion that it should be using the FieldType's
 indexedToReadable() method -- but it doesn't do that.  If it did, then
 this part of hte logic should work for you as long as your custom
 FieldType implemented indexedToReadable in a sane way.
 2) QueryComponent.mergeIds - that TODO definitely looks like a gap that
 needs to be filled.  I'm guessing the sanest thing to do in the CUSTOM case
 would be to ask the FieldComparatorSource (which should be coming from the
 SortField that the custom FieldType produced) to create a FieldComparator
 (via newComparator - the numHits and sortPos could be anything) and then
 wrap that up in a Comparator facade that delegates to
 FieldComparator.compareValues
 That way a custom FieldType could be in complete control of the sort
 comparisons (even when merging ids).
 ...But as I said: I may be missing something, I'm not super familiar with
 that code.  Please try it out and let us know if that works -- either way
 please open a Jira pointing out the problems trying to implement
 distributed sorting in a custom FieldType.
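The Comparator facade described here can be sketched in a few lines. This is a hypothetical illustration with a stand-in interface (ValueComparator mimics Lucene's FieldComparator.compareValues; in real code the comparator would come from FieldComparatorSource.newComparator), not actual Solr code:

```java
import java.util.Comparator;

public class CompareValuesFacade {
    // Minimal stand-in for FieldComparator#compareValues(Object, Object).
    // Purely illustrative; the real comparator would be created via
    // FieldComparatorSource.newComparator(...).
    interface ValueComparator {
        int compareValues(Object first, Object second);
    }

    // Wrap the custom comparator in a plain java.util.Comparator so merge
    // code that needs Comparable-style ordering can use it directly, even
    // for values like byte[] that are not themselves Comparable.
    static Comparator<Object> asComparator(final ValueComparator fc) {
        return new Comparator<Object>() {
            public int compare(Object a, Object b) {
                return fc.compareValues(a, b);
            }
        };
    }

    public static void main(String[] args) {
        // Example: byte[] sort values compared lexicographically, the kind
        // of ordering a custom FieldType might define.
        ValueComparator byteOrder = new ValueComparator() {
            public int compareValues(Object first, Object second) {
                byte[] a = (byte[]) first, b = (byte[]) second;
                for (int i = 0; i < Math.min(a.length, b.length); i++) {
                    int d = (a[i] & 0xff) - (b[i] & 0xff);
                    if (d != 0) return d;
                }
                return a.length - b.length;
            }
        };
        Comparator<Object> cmp = asComparator(byteOrder);
        System.out.println(cmp.compare(new byte[]{1}, new byte[]{2}) < 0);
        System.out.println(cmp.compare(new byte[]{3}, new byte[]{3}) == 0);
    }
}
```

The byte-level ordering shown is just one example of what a custom FieldType might define; the point is only that wrapping compareValues in a java.util.Comparator sidesteps the byte[]-is-not-Comparable problem in mergeIds.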



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5283) Fail the build if ant test didn't execute any tests (everything filtered out).

2013-11-01 Thread Ryan Ernst (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811466#comment-13811466
 ] 

Ryan Ernst commented on LUCENE-5283:


+1, I've been bitten by this before.  This looks great.

 Fail the build if ant test didn't execute any tests (everything filtered out).
 --

 Key: LUCENE-5283
 URL: https://issues.apache.org/jira/browse/LUCENE-5283
 Project: Lucene - Core
  Issue Type: Wish
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
 Fix For: 4.6, 5.0

 Attachments: LUCENE-5283.patch, LUCENE-5283.patch


 This should be an optional setting that defaults to 'false' (the build 
 proceeds).






[jira] [Commented] (LUCENE-5283) Fail the build if ant test didn't execute any tests (everything filtered out).

2013-11-01 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811474#comment-13811474
 ] 

Dawid Weiss commented on LUCENE-5283:
-

No it doesn't, it looks ugly, ugly... :) But I don't see how it could be done 
in any other way (given how we use ant and subant calls).

 Fail the build if ant test didn't execute any tests (everything filtered out).
 --

 Key: LUCENE-5283
 URL: https://issues.apache.org/jira/browse/LUCENE-5283
 Project: Lucene - Core
  Issue Type: Wish
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
 Fix For: 4.6, 5.0

 Attachments: LUCENE-5283.patch, LUCENE-5283.patch


 This should be an optional setting that defaults to 'false' (the build 
 proceeds).






[jira] [Comment Edited] (SOLR-5027) Field Collapsing PostFilter

2013-11-01 Thread David (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811540#comment-13811540
 ] 

David edited comment on SOLR-5027 at 11/1/13 6:30 PM:
--

I created the following unit test in TestCollapseQParserPlugin.java to 
illustrate the bug:

{code}
ModifiableSolrParams params = new ModifiableSolrParams();
params.add("q", "*:*");
params.add("fq", "{!collapse field=group_s}");
params.add("defType", "edismax");
params.add("bf", "field(test_ti)");
params.add("fq", "{!tag=test_ti}test_ti:5");
params.add("facet", "true");
params.add("facet.field", "{!ex=test_ti}test_ti");
assertQ(req(params), "*[count(//doc)=1]",
    "//doc[./int[@name='test_ti']='5']");
{code}


was (Author: dboychuck):
I created the following unit test in TestCollapseQParserPlugin.java to 
illustrate the bug:

ModifiableSolrParams params = new ModifiableSolrParams();
params.add("q", "*:*");
params.add("fq", "{!collapse field=group_s}");
params.add("defType", "edismax");
params.add("bf", "field(test_ti)");
params.add("fq", "{!tag=test_ti}test_ti:5");
params.add("facet", "true");
params.add("facet.field", "{!ex=test_ti}test_ti");
assertQ(req(params), "*[count(//doc)=1]",
    "//doc[./int[@name='test_ti']='5']");

 Field Collapsing PostFilter
 ---

 Key: SOLR-5027
 URL: https://issues.apache.org/jira/browse/SOLR-5027
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
 Fix For: 4.6, 5.0

 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
 SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
 SOLR-5027.patch, SOLR-5027.patch


 This ticket introduces the *CollapsingQParserPlugin* 
 The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
 This is a high performance alternative to standard Solr field collapsing 
 (with *ngroups*) when the number of distinct groups in the result set is high.
 For example in one performance test, a search with 10 million full results 
 and 1 million collapsed groups:
 Standard grouping with ngroups: 17 seconds.
 CollapsingQParserPlugin: 300 milliseconds.
 Sample syntax:
 Collapse based on the highest scoring document:
 {code}
 fq={!collapse field=field_name}
 {code}
 Collapse based on the min value of a numeric field:
 {code}
 fq={!collapse field=field_name min=field_name}
 {code}
 Collapse based on the max value of a numeric field:
 {code}
 fq={!collapse field=field_name max=field_name}
 {code}
 Collapse with a null policy:
 {code}
 fq={!collapse field=field_name nullPolicy=null_policy}
 {code}
 There are three null policies:
 ignore : removes docs with a null value in the collapse field (default).
 expand : treats each doc with a null value in the collapse field as a 
 separate group.
 collapse : collapses all docs with a null value into a single group using 
 either highest score, or min/max.
 The CollapsingQParserPlugin also fully supports the QueryElevationComponent.
 *Note:*  The July 16 patch also includes an ExpandComponent that expands the 
 collapsed groups for the current search result page. This functionality will 
 be moved to its own ticket.






[jira] [Commented] (SOLR-5027) Field Collapsing PostFilter

2013-11-01 Thread David (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811540#comment-13811540
 ] 

David commented on SOLR-5027:
-

I created the following unit test in TestCollapseQParserPlugin.java to 
illustrate the bug:

ModifiableSolrParams params = new ModifiableSolrParams();
params.add("q", "*:*");
params.add("fq", "{!collapse field=group_s}");
params.add("defType", "edismax");
params.add("bf", "field(test_ti)");
params.add("fq", "{!tag=test_ti}test_ti:5");
params.add("facet", "true");
params.add("facet.field", "{!ex=test_ti}test_ti");
assertQ(req(params), "*[count(//doc)=1]",
    "//doc[./int[@name='test_ti']='5']");

 Field Collapsing PostFilter
 ---

 Key: SOLR-5027
 URL: https://issues.apache.org/jira/browse/SOLR-5027
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
 Fix For: 4.6, 5.0

 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
 SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
 SOLR-5027.patch, SOLR-5027.patch


 This ticket introduces the *CollapsingQParserPlugin* 
 The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
 This is a high performance alternative to standard Solr field collapsing 
 (with *ngroups*) when the number of distinct groups in the result set is high.
 For example in one performance test, a search with 10 million full results 
 and 1 million collapsed groups:
 Standard grouping with ngroups: 17 seconds.
 CollapsingQParserPlugin: 300 milliseconds.
 Sample syntax:
 Collapse based on the highest scoring document:
 {code}
 fq={!collapse field=field_name}
 {code}
 Collapse based on the min value of a numeric field:
 {code}
 fq={!collapse field=field_name min=field_name}
 {code}
 Collapse based on the max value of a numeric field:
 {code}
 fq={!collapse field=field_name max=field_name}
 {code}
 Collapse with a null policy:
 {code}
 fq={!collapse field=field_name nullPolicy=null_policy}
 {code}
 There are three null policies:
 ignore : removes docs with a null value in the collapse field (default).
 expand : treats each doc with a null value in the collapse field as a 
 separate group.
 collapse : collapses all docs with a null value into a single group using 
 either highest score, or min/max.
 The CollapsingQParserPlugin also fully supports the QueryElevationComponent.
 *Note:*  The July 16 patch also includes an ExpandComponent that expands the 
 collapsed groups for the current search result page. This functionality will 
 be moved to its own ticket.






[jira] [Updated] (SOLR-5416) CollapsingQParserPlugin

2013-11-01 Thread David (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David updated SOLR-5416:


Description: 
Trying to use CollapsingQParserPlugin with facet tagging throws an exception. 

{code}
ModifiableSolrParams params = new ModifiableSolrParams();
params.add("q", "*:*");
params.add("fq", "{!collapse field=group_s}");
params.add("defType", "edismax");
params.add("bf", "field(test_ti)");
params.add("fq", "{!tag=test_ti}test_ti:5");
params.add("facet", "true");
params.add("facet.field", "{!ex=test_ti}test_ti");
assertQ(req(params), "*[count(//doc)=1]",
    "//doc[./int[@name='test_ti']='5']");
{code}

  was:
Trying to use CollapsingQParserPlugin with facet tagging throws an exception. 

ModifiableSolrParams params = new ModifiableSolrParams();
params.add("q", "*:*");
params.add("fq", "{!collapse field=group_s}");
params.add("defType", "edismax");
params.add("bf", "field(test_ti)");
params.add("fq", "{!tag=test_ti}test_ti:5");
params.add("facet", "true");
params.add("facet.field", "{!ex=test_ti}test_ti");
assertQ(req(params), "*[count(//doc)=1]",
    "//doc[./int[@name='test_ti']='5']");


 CollapsingQParserPlugin
 ---

 Key: SOLR-5416
 URL: https://issues.apache.org/jira/browse/SOLR-5416
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 4.6
Reporter: David
  Labels: group, grouping
 Fix For: 4.6, 5.0

   Original Estimate: 48h
  Remaining Estimate: 48h

 Trying to use CollapsingQParserPlugin with facet tagging throws an exception. 
 {code}
  ModifiableSolrParams params = new ModifiableSolrParams();
  params.add("q", "*:*");
  params.add("fq", "{!collapse field=group_s}");
  params.add("defType", "edismax");
  params.add("bf", "field(test_ti)");
  params.add("fq", "{!tag=test_ti}test_ti:5");
  params.add("facet", "true");
  params.add("facet.field", "{!ex=test_ti}test_ti");
  assertQ(req(params), "*[count(//doc)=1]",
      "//doc[./int[@name='test_ti']='5']");
 {code}






[jira] [Created] (SOLR-5416) CollapsingQParserPlugin

2013-11-01 Thread David (JIRA)
David created SOLR-5416:
---

 Summary: CollapsingQParserPlugin
 Key: SOLR-5416
 URL: https://issues.apache.org/jira/browse/SOLR-5416
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 4.6
Reporter: David
 Fix For: 4.6, 5.0


Trying to use CollapsingQParserPlugin with facet tagging throws an exception. 

ModifiableSolrParams params = new ModifiableSolrParams();
params.add("q", "*:*");
params.add("fq", "{!collapse field=group_s}");
params.add("defType", "edismax");
params.add("bf", "field(test_ti)");
params.add("fq", "{!tag=test_ti}test_ti:5");
params.add("facet", "true");
params.add("facet.field", "{!ex=test_ti}test_ti");
assertQ(req(params), "*[count(//doc)=1]",
    "//doc[./int[@name='test_ti']='5']");






[jira] [Commented] (SOLR-5084) new field type - EnumField

2013-11-01 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811683#comment-13811683
 ] 

Erick Erickson commented on SOLR-5084:
--

OK, looks good. On my Mac, I get two failures, but I also get those same 
failures in trunk without the patch.


So unless someone objects, I'll commit this over the weekend.



 new field type - EnumField
 --

 Key: SOLR-5084
 URL: https://issues.apache.org/jira/browse/SOLR-5084
 Project: Solr
  Issue Type: New Feature
Reporter: Elran Dvir
Assignee: Erick Erickson
 Attachments: Solr-5084.patch, Solr-5084.patch, Solr-5084.patch, 
 Solr-5084.patch, Solr-5084.trunk.patch, Solr-5084.trunk.patch, 
 Solr-5084.trunk.patch, Solr-5084.trunk.patch, Solr-5084.trunk.patch, 
 Solr-5084.trunk.patch, Solr-5084.trunk.patch, enumsConfig.xml, 
 schema_example.xml


 We have encountered a use case in our system where we have a few fields 
 (Severity, Risk, etc.) with a closed set of values, where the sort order for 
 these values is pre-determined but not lexicographic (Critical is higher than 
 High). Generically this is very close to how enums work.
 To implement, I have prototyped a new type of field: EnumField, where the 
 inputs are a closed, predefined set of strings in a special configuration 
 file (similar to currency.xml).
 The code is based on 4.2.1.






[jira] [Commented] (SOLR-5302) Analytics Component

2013-11-01 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811682#comment-13811682
 ] 

Erick Erickson commented on SOLR-5302:
--

OK, I'm not quite sure how to proceed given the size of this patch. What do 
people think about this as a way forward?

I'll do all the pre-commit/ant testing stuff, basically the secretarial work 
involved in committing this to trunk. Since this is a new component, it's at 
least somewhat isolated from other bits of the code. I'll let it bake for a 
while in trunk and then merge into 4x. Since we just put 4.5.1 out (well, Mark 
did), if sometime a week or so after it's committed to trunk I merge it to 4x, 
there'll be substantial time to bake there before any 4.6 goes out.

Of course I'll look it over, but given the size it'll be mostly a surface level 
look-over. Anyone who wants to delve into details is more than welcome to...

How does that sound?

 Analytics Component
 ---

 Key: SOLR-5302
 URL: https://issues.apache.org/jira/browse/SOLR-5302
 Project: Solr
  Issue Type: New Feature
Reporter: Steven Bower
Assignee: Erick Erickson
 Attachments: SOLR-5302.patch, Search Analytics Component.pdf, 
 Statistical Expressions.pdf, solr_analytics-2013.10.04-2.patch


 This ticket is to track a replacement for the StatsComponent. The 
 AnalyticsComponent supports the following features:
 * All functionality of StatsComponent (SOLR-4499)
 * Field Faceting (SOLR-3435)
 ** Support for limit
 ** Sorting (bucket name or any stat in the bucket)
 ** Support for offset
 * Range Faceting
 ** Supports all options of standard range faceting
 * Query Faceting (SOLR-2925)
 * Ability to use overall/field facet statistics as input to range/query 
 faceting (i.e. calc min/max date and then facet over that range)
 * Support for more complex aggregate/mapping operations (SOLR-1622)
 ** Aggregations: min, max, sum, sum-of-square, count, missing, stddev, mean, 
 median, percentiles
 ** Operations: negation, abs, add, multiply, divide, power, log, date math, 
 string reversal, string concat
 ** Easily pluggable framework to add additional operations
 * New / cleaner output format
 Outstanding Issues:
 * Multi-value field support for stats (supported for faceting)
 * Multi-shard support (may not be possible for some operations, eg median)






[jira] [Updated] (SOLR-5416) CollapsingQParserPlugin bug with Tagging

2013-11-01 Thread David (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David updated SOLR-5416:


Summary: CollapsingQParserPlugin bug with Tagging  (was: 
CollapsingQParserPlugin)

 CollapsingQParserPlugin bug with Tagging
 

 Key: SOLR-5416
 URL: https://issues.apache.org/jira/browse/SOLR-5416
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 4.6
Reporter: David
  Labels: group, grouping
 Fix For: 4.6, 5.0

   Original Estimate: 48h
  Remaining Estimate: 48h

 Trying to use CollapsingQParserPlugin with facet tagging throws an exception. 
 {code}
  ModifiableSolrParams params = new ModifiableSolrParams();
  params.add("q", "*:*");
  params.add("fq", "{!collapse field=group_s}");
  params.add("defType", "edismax");
  params.add("bf", "field(test_ti)");
  params.add("fq", "{!tag=test_ti}test_ti:5");
  params.add("facet", "true");
  params.add("facet.field", "{!ex=test_ti}test_ti");
  assertQ(req(params), "*[count(//doc)=1]",
      "//doc[./int[@name='test_ti']='5']");
 {code}






[jira] [Updated] (SOLR-5416) CollapsingQParserPlugin bug with Tagging

2013-11-01 Thread David (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David updated SOLR-5416:


Attachment: SOLR-5416.patch

I'm not sure if this is the right solution, but it is giving me the correct 
facet counts when only tagging one fq. I still need to test with additional fq tags.

 CollapsingQParserPlugin bug with Tagging
 

 Key: SOLR-5416
 URL: https://issues.apache.org/jira/browse/SOLR-5416
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 4.6
Reporter: David
  Labels: group, grouping
 Fix For: 4.6, 5.0

 Attachments: SOLR-5416.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 Trying to use CollapsingQParserPlugin with facet tagging throws an exception. 
 {code}
  ModifiableSolrParams params = new ModifiableSolrParams();
  params.add("q", "*:*");
  params.add("fq", "{!collapse field=group_s}");
  params.add("defType", "edismax");
  params.add("bf", "field(test_ti)");
  params.add("fq", "{!tag=test_ti}test_ti:5");
  params.add("facet", "true");
  params.add("facet.field", "{!ex=test_ti}test_ti");
  assertQ(req(params), "*[count(//doc)=1]",
      "//doc[./int[@name='test_ti']='5']");
 {code}






[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1016: POMs out of sync

2013-11-01 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1016/

All tests passed

Build Log:
[...truncated 36391 lines...]
  [mvn] [INFO] -
  [mvn] [INFO] -
  [mvn] [ERROR] COMPILATION ERROR : 
  [mvn] [INFO] -

[...truncated 593 lines...]




[jira] [Created] (SOLR-5417) The ChaosMonkey tests are not causing any disruption.

2013-11-01 Thread Mark Miller (JIRA)
Mark Miller created SOLR-5417:
-

 Summary: The ChaosMonkey tests are not causing any disruption.
 Key: SOLR-5417
 URL: https://issues.apache.org/jira/browse/SOLR-5417
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller


At some point, a map keyed by core node name changed to be keyed by node name, 
so when the chaos monkey tries to get a jetty, it always fails.
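The failure mode described above (the map's key scheme changed while callers kept looking up by the old key) reduces to a plain map with mismatched keys. The sketch below is purely illustrative; all names are made up and it is not the actual ChaosMonkey code:

```java
import java.util.HashMap;
import java.util.Map;

public class KeyMismatch {
    public static void main(String[] args) {
        // The map is now keyed by node name...
        Map<String, String> jettyByNodeName = new HashMap<>();
        jettyByNodeName.put("127.0.0.1:8983_solr", "jetty-1");

        // ...but the caller still looks up by core node name (a stale key),
        // so the lookup silently returns null every time and the monkey
        // never finds a jetty to kill.
        String coreNodeName = "core_node1";
        System.out.println(jettyByNodeName.get(coreNodeName));
        System.out.println(jettyByNodeName.get("127.0.0.1:8983_solr"));
    }
}
```

Because Map.get returns null rather than throwing on a missing key, this kind of drift disrupts nothing visibly, which is exactly why the tests kept passing without ever killing anything.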






[jira] [Updated] (SOLR-5417) The ChaosMonkey tests are not causing any disruption.

2013-11-01 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-5417:
--

Priority: Critical  (was: Major)

 The ChaosMonkey tests are not causing any disruption.
 -

 Key: SOLR-5417
 URL: https://issues.apache.org/jira/browse/SOLR-5417
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Critical

 At some point, a map keyed by core node name changed to be keyed by node 
 name, so when the chaos monkey tries to get a jetty, it always fails.






[jira] [Commented] (SOLR-5417) The ChaosMonkey tests are not causing any disruption.

2013-11-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811811#comment-13811811
 ] 

ASF subversion and git services commented on SOLR-5417:
---

Commit 1538110 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1538110 ]

SOLR-5417: The ChaosMonkey tests are not causing any disruption.

 The ChaosMonkey tests are not causing any disruption.
 -

 Key: SOLR-5417
 URL: https://issues.apache.org/jira/browse/SOLR-5417
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Critical

 At some point, a map keyed by core node name changed to be keyed by node 
 name, so when the chaos monkey tries to get a jetty, it always fails.






[jira] [Commented] (SOLR-5417) The ChaosMonkey tests are not causing any disruption.

2013-11-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811812#comment-13811812
 ] 

ASF subversion and git services commented on SOLR-5417:
---

Commit 1538111 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1538111 ]

SOLR-5417: The ChaosMonkey tests are not causing any disruption.

 The ChaosMonkey tests are not causing any disruption.
 -

 Key: SOLR-5417
 URL: https://issues.apache.org/jira/browse/SOLR-5417
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Critical

 At some point, a map keyed by core node name changed to be keyed by node 
 name, so when the chaos monkey tries to get a jetty, it always fails.






[jira] [Commented] (SOLR-5232) SolrCloud should distribute updates via streaming rather than buffering.

2013-11-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811813#comment-13811813
 ] 

ASF subversion and git services commented on SOLR-5232:
---

Commit 1538112 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1538112 ]

SOLR-5232: fix retry logic

 SolrCloud should distribute updates via streaming rather than buffering.
 

 Key: SOLR-5232
 URL: https://issues.apache.org/jira/browse/SOLR-5232
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Critical
 Fix For: 4.6, 5.0

 Attachments: SOLR-5232.patch, SOLR-5232.patch, SOLR-5232.patch, 
 SOLR-5232.patch, SOLR-5232.patch, SOLR-5232.patch


 The current approach was never the best for SolrCloud - it was designed for a 
 pre-SolrCloud Solr - and it uses too many connections and threads. Nailing 
 that down is likely wasted effort when we should really move away from 
 explicitly buffering docs and sending small batches per thread as we have 
 been doing.






[jira] [Commented] (SOLR-5232) SolrCloud should distribute updates via streaming rather than buffering.

2013-11-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811814#comment-13811814
 ] 

ASF subversion and git services commented on SOLR-5232:
---

Commit 1538113 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1538113 ]

SOLR-5232: fix retry logic

 SolrCloud should distribute updates via streaming rather than buffering.
 

 Key: SOLR-5232
 URL: https://issues.apache.org/jira/browse/SOLR-5232
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Critical
 Fix For: 4.6, 5.0

 Attachments: SOLR-5232.patch, SOLR-5232.patch, SOLR-5232.patch, 
 SOLR-5232.patch, SOLR-5232.patch, SOLR-5232.patch


 The current approach was never the best for SolrCloud - it was designed for a 
 pre-SolrCloud Solr - and it uses too many connections and threads. Nailing 
 that down is likely wasted effort when we should really move away from 
 explicitly buffering docs and sending small batches per thread as we have 
 been doing.






[jira] [Created] (LUCENE-5321) Remove Facet42DocValuesFormat

2013-11-01 Thread Shai Erera (JIRA)
Shai Erera created LUCENE-5321:
--

 Summary: Remove Facet42DocValuesFormat
 Key: LUCENE-5321
 URL: https://issues.apache.org/jira/browse/LUCENE-5321
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera


The new DirectDocValuesFormat is nearly identical to Facet42DVF, except that it 
stores the addresses in a direct int[] rather than PackedInts. On LUCENE-5296 we 
measured the performance of DirectDVF vs Facet42DVF: it improves performance for 
some queries and has a negligible effect on others, and RAM consumption 
isn't much worse. We should remove Facet42DVF and use DirectDVF instead.

I also want to rename Facet46Codec to FacetCodec. There's no need to refactor 
the class whenever the default codec changes (e.g. from 45 to 46), since it 
doesn't care about the actual Codec version underneath; it only overrides the 
DVF used for the facet fields. FacetCodec should take the DVF from the app (so 
e.g. the facet/ module doesn't depend on codecs/) and be exposed more as a 
utility Codec rather than a real, versioned Codec.
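The utility-codec shape proposed here could look roughly like the following. Stand-in classes are used instead of Lucene's real Codec/DocValuesFormat types (real code would subclass a Codec and override getDocValuesFormatForField), so everything in this sketch is illustrative:

```java
import java.util.HashSet;
import java.util.Set;

public class FacetCodecSketch {
    // Stand-in for Lucene's DocValuesFormat; only the name matters here.
    static class DocValuesFormat {
        final String name;
        DocValuesFormat(String name) { this.name = name; }
    }

    // A version-agnostic "FacetCodec": delegates to the default format for
    // ordinary fields, and uses whatever format the application supplied
    // for the facet fields. No version number baked into the class.
    static class FacetCodec {
        private final DocValuesFormat facetFormat;
        private final DocValuesFormat defaultFormat;
        private final Set<String> facetFields;

        FacetCodec(DocValuesFormat facetFormat, DocValuesFormat defaultFormat,
                   Set<String> facetFields) {
            this.facetFormat = facetFormat;
            this.defaultFormat = defaultFormat;
            this.facetFields = facetFields;
        }

        DocValuesFormat getDocValuesFormatForField(String field) {
            return facetFields.contains(field) ? facetFormat : defaultFormat;
        }
    }

    public static void main(String[] args) {
        Set<String> facetFields = new HashSet<>();
        facetFields.add("$facets");
        FacetCodec codec = new FacetCodec(new DocValuesFormat("Direct"),
                                          new DocValuesFormat("Lucene45"),
                                          facetFields);
        System.out.println(codec.getDocValuesFormatForField("$facets").name);
        System.out.println(codec.getDocValuesFormatForField("title").name);
    }
}
```

Because the app injects the DVF, the facet/ module never needs to name a concrete codec version, which is the decoupling argued for above.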






[jira] [Commented] (LUCENE-5296) Add DirectDocValuesFormat

2013-11-01 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811866#comment-13811866
 ] 

Shai Erera commented on LUCENE-5296:


+1. I opened LUCENE-5321 since I want to address FacetCodec changes in general.

 Add DirectDocValuesFormat
 -

 Key: LUCENE-5296
 URL: https://issues.apache.org/jira/browse/LUCENE-5296
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.6, 5.0

 Attachments: LUCENE-5296.patch


 Indexes values to disk but at search time it loads/accesses the values via 
 simple java arrays (i.e. no compression).






[jira] [Commented] (LUCENE-5189) Numeric DocValues Updates

2013-11-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811872#comment-13811872
 ] 

ASF subversion and git services commented on LUCENE-5189:
-

Commit 1538143 from [~shaie] in branch 'dev/trunk'
[ https://svn.apache.org/r1538143 ]

LUCENE-5189: rename internal API following NumericDocValues updates

 Numeric DocValues Updates
 -

 Key: LUCENE-5189
 URL: https://issues.apache.org/jira/browse/LUCENE-5189
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5189-4x.patch, LUCENE-5189-4x.patch, 
 LUCENE-5189-no-lost-updates.patch, LUCENE-5189-renames.patch, 
 LUCENE-5189-segdv.patch, LUCENE-5189-updates-order.patch, 
 LUCENE-5189-updates-order.patch, LUCENE-5189.patch, LUCENE-5189.patch, 
 LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, 
 LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, LUCENE-5189.patch, 
 LUCENE-5189.patch, LUCENE-5189_process_events.patch, 
 LUCENE-5189_process_events.patch


 In LUCENE-4258 we started to work on incremental field updates, however the 
 amount of changes is immense and hard to follow/consume. The reason is that 
 we targeted postings, stored fields, DV etc., all from the get-go.
 I'd like to start afresh here, with numeric-dv-field updates only. There are 
 a couple of reasons to that:
 * NumericDV fields should be easier to update, if e.g. we write all the 
 values of all the documents in a segment for the updated field (similar to 
 how livedocs work, and previously norms).
 * It's a fairly contained issue, attempting to handle just one data type to 
 update, yet requires many changes to core code which will also be useful for 
 updating other data types.
 * It has value in and of itself, and we don't need to allow updating all the 
 data types in Lucene at once ... we can do that gradually.
 I have some working patch already which I'll upload next, explaining the 
 changes.






[jira] [Created] (LUCENE-5322) Clean up / simplify Maven-related Ant targets

2013-11-01 Thread Steve Rowe (JIRA)
Steve Rowe created LUCENE-5322:
--

 Summary: Clean up / simplify Maven-related Ant targets
 Key: LUCENE-5322
 URL: https://issues.apache.org/jira/browse/LUCENE-5322
 Project: Lucene - Core
  Issue Type: Task
  Components: general/build
Reporter: Steve Rowe
Assignee: Steve Rowe
Priority: Minor
 Fix For: 4.6, 5.0


Many Maven-related Ant targets are public when they don't need to be, e.g. 
dist-maven, filter-pom-templates, m2-deploy-lucene-parent-pom, etc.

The arrangement of these targets could be simplified if the directories that 
have public entry points were minimized.

generate-maven-artifacts should be runnable from the top level and from lucene/ 
and solr/. 






[jira] [Updated] (LUCENE-5322) Clean up / simplify Maven-related Ant targets

2013-11-01 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated LUCENE-5322:
---

Attachment: LUCENE-5322.patch

Patch.

Targets that don't need to be public are made private.

Each of the three versions of generate-maven-artifacts makes sure that 
resolve, -unpack-(lucene and/or solr)-tgz, and -filter-pom-templates are 
called.  Then the recursive non-public -dist-maven target doesn't have to 
worry about these things being done.

This patch also fixes the problem introduced by LUCENE-5217 with resolve not 
being called before get-maven-poms and filter-pom-templates.

I'll commit this to trunk shortly, then to branch_4x after LUCENE-5217 has been 
committed to branch_4x, in a few days.
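
The "private target" convention Ant builds commonly use here can be sketched as
follows (a hypothetical fragment for illustration, not the actual patch; target
names are assumptions). A target whose name starts with "-" cannot be invoked
from the command line, because Ant parses "-foo" as an option, so it is only
reachable via depends= or antcall:

```xml
<!-- Hypothetical sketch of the convention, not the actual patch. -->
<!-- Public entry point: runnable as "ant generate-maven-artifacts". -->
<target name="generate-maven-artifacts"
        depends="resolve,-filter-pom-templates,-dist-maven"/>

<!-- Leading "-" makes these uncallable from the CLI ("ant -dist-maven"
     would be parsed as an option), so they run only via depends=. -->
<target name="-filter-pom-templates">
  <!-- filter version properties into pom.xml templates -->
</target>

<target name="-dist-maven">
  <!-- build and stage the Maven artifacts -->
</target>
```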

 Clean up / simplify Maven-related Ant targets
 -

 Key: LUCENE-5322
 URL: https://issues.apache.org/jira/browse/LUCENE-5322
 Project: Lucene - Core
  Issue Type: Task
  Components: general/build
Reporter: Steve Rowe
Assignee: Steve Rowe
Priority: Minor
 Fix For: 4.6, 5.0

 Attachments: LUCENE-5322.patch


 Many Maven-related Ant targets are public when they don't need to be, e.g. 
 dist-maven, filter-pom-templates, m2-deploy-lucene-parent-pom, etc.
 The arrangement of these targets could be simplified if the directories that 
 have public entry points were minimized.
 generate-maven-artifacts should be runnable from the top level and from 
 lucene/ and solr/. 






[jira] [Commented] (LUCENE-5322) Clean up / simplify Maven-related Ant targets

2013-11-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811878#comment-13811878
 ] 

ASF subversion and git services commented on LUCENE-5322:
-

Commit 1538144 from [~steve_rowe] in branch 'dev/trunk'
[ https://svn.apache.org/r1538144 ]

LUCENE-5322: Clean up / simplify Maven-related Ant targets







[jira] [Commented] (LUCENE-5322) Clean up / simplify Maven-related Ant targets

2013-11-01 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811879#comment-13811879
 ] 

Steve Rowe commented on LUCENE-5322:


Committed to trunk.

{{ant nightly-smoke}}, {{ant generate-maven-artifacts}} (at all three 
locations), {{ant validate-maven-artifacts}} and {{and get-maven-poms}} all 
succeed for me locally.







[jira] [Comment Edited] (LUCENE-5322) Clean up / simplify Maven-related Ant targets

2013-11-01 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811879#comment-13811879
 ] 

Steve Rowe edited comment on LUCENE-5322 at 11/2/13 5:06 AM:
-

Committed to trunk.

{{ant nightly-smoke}}, {{ant generate-maven-artifacts}} (at all three 
locations), {{ant validate-maven-artifacts}} and {{ant get-maven-poms}} all 
succeed for me locally.


was (Author: steve_rowe):
Committed to trunk.

{{ant nightly-smoke}}, {{ant generate-maven-artifacts}} (at all three 
locations), {{ant validate-maven-artifacts}} and {{and get-maven-poms}} all 
succeed for me locally.







Re: [JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1015: POMs out of sync

2013-11-01 Thread Steve Rowe
This should be fixed by my commit for LUCENE-5322 - I can no longer
reproduce the problem locally.


On Fri, Nov 1, 2013 at 12:23 PM, Steve Rowe sar...@gmail.com wrote:

 The problem here is that the copied/filtered solr-core POM is missing a
 lot of dependencies: log4j, noggit, commons-io, httpcomponents.  These are
 added to the solr-core “classpath” in the Ant build from solrj and example
 lib/ directories.  When these are collected from the Ant build in
 preparation for filtering the POMs, only files that actually exist make it
 into the “classpath”, and apparently at the point the solr-core “classpath”
 is examined, there is nothing in the solrj/lib/ and example/lib/
 directories.

 I can reproduce this locally, if I first rm -rf $(find . -name '*.jar'),
 like the Jenkins build does.  Looks like when “generate-maven-artifacts”
 calls “filter-pom-templates”, it doesn’t invoke “resolve” (unlike
 “get-maven-poms”, which does).  Fix should be simple - I’ll work on it
 later today.
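
 The fix described above can be sketched as a one-line dependency change (a
 hypothetical fragment; the actual target names and wiring in the committed
 patch may differ):

 ```xml
 <!-- Hypothetical sketch: make POM filtering depend on resolve, so the
      jars it inspects to build the "classpath" exist before the POM
      templates are filtered. -->
 <target name="filter-pom-templates"
         depends="resolve,-filter-pom-templates"/>
 ```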

 Steve

 On Oct 31, 2013, at 10:50 PM, Apache Jenkins Server 
 jenk...@builds.apache.org wrote:

  Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1015/
 
  All tests passed
 
  Build Log:
  [...truncated 36479 lines...]
   [mvn] [INFO]
 -
   [mvn] [INFO]
 -
   [mvn] [ERROR] COMPILATION ERROR :
   [mvn] [INFO]
 -
 
  [...truncated 593 lines...]
 
 
 