[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229928#comment-13229928 ] Harley Parks commented on SOLR-2155: Sorry about the editing, but thanks for the feedback. So perhaps there is something wrong with my configuration, since the search results return the geohash string rather than lat,long. The maven build went great. I did finally figure out how to use the geofilt function using pt and d. I am reindexing each time; yes, I delete the data folder and reindex. I am placing the jar file into tomcat's solr/lib folder. After a restart, and after changing solrconfig and schema, the geohash string is displayed, not the lat,long.

Schema field type:
<fieldType name="geohash" class="solr2155.solr.schema.GeoHashField" length="12"/>

This is the field:
<field name="GeoTagGeoHash" type="geohash" indexed="true" stored="true" multiValued="true"/>

This is the info from solr/admin/ field types: GEOHASH; Field Type: geohash; Fields: GEOTAGGEOHASH; Tokenized: true; Class Name: solr2155.solr.schema.GeoHashField; Index Analyzer: org.apache.solr.analysis.TokenizerChain; Tokenizer Class: solr2155.solr.schema.GeoHashField$1; Query Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer

Still, the query returns values like:
<arr name="GeoTagGeoHash">
  <str>rw3sh9g8c6mx</str>
  <str>rw3f3xc9dnh3</str>
  <str>rw3ckbue74y7</str>
</arr>

So, if this is not right, is there anything I can do to troubleshoot?
Geospatial search using geohash prefixes Key: SOLR-2155 URL: https://issues.apache.org/jira/browse/SOLR-2155 Project: Solr Issue Type: Improvement Reporter: David Smiley Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch There currently isn't a solution in Solr for doing geospatial filtering on documents that have a variable number of points. This scenario occurs when there is location extraction (e.g. via a gazetteer) occurring on free text. None, one, or many geospatial locations might be extracted from any given document, and users want to limit their search results to those occurring in a user-specified area. I've implemented this by furthering the GeoHash based work in Lucene/Solr with a geohash prefix based filter. A geohash refers to a lat-lon box on the earth. Each successive character added further subdivides the box into a 4x8 (or 8x4, depending on the even/odd length of the geohash) grid. The first step in this scheme is figuring out which geohash grid squares cover the user's search query. I've added various extra methods to GeoHashUtils (and added tests) to assist in this purpose. The next step is an actual Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in TermsEnum.seek() to skip to relevant grid squares in the index. Once a matching geohash grid is found, the points therein are compared against the user's query to see if it matches. I created an abstraction GeoShape extended by subclasses named PointDistance... and CartesianBox to support different queried shapes so that the filter need not care about these details. This work was presented at LuceneRevolution in Boston on October 8th. -- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
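The prefix scheme described above builds on standard geohash encoding: five bits per base-32 character, alternating longitude/latitude bisections, so a shared prefix implies spatial containment, which is the property GeoHashPrefixFilter exploits when seeking through the terms index. As a hedged illustration (a from-scratch sketch, not the GeoHashUtils code in the patch), a minimal encoder looks like this:

```java
// Minimal geohash encoder: each output character narrows the bounding box,
// interleaving longitude and latitude bisections, five bits per character.
public class GeoHashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    public static String encode(double lat, double lon, int precision) {
        double minLat = -90, maxLat = 90, minLon = -180, maxLon = 180;
        StringBuilder out = new StringBuilder(precision);
        boolean evenBit = true; // even bits split longitude, odd bits latitude
        int bits = 0, ch = 0;
        while (out.length() < precision) {
            ch <<= 1;
            if (evenBit) {
                double mid = (minLon + maxLon) / 2;
                if (lon >= mid) { ch |= 1; minLon = mid; } else { maxLon = mid; }
            } else {
                double mid = (minLat + maxLat) / 2;
                if (lat >= mid) { ch |= 1; minLat = mid; } else { maxLat = mid; }
            }
            evenBit = !evenBit;
            if (++bits == 5) { // five bits accumulated: emit one base-32 character
                out.append(BASE32.charAt(ch));
                bits = 0;
                ch = 0;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Classic reference point: lat 42.6, lon -5.6 encodes to "ezs42".
        System.out.println(encode(42.6, -5.6, 5)); // prints ezs42
    }
}
```

Longer prefixes give finer grids; a 12-character geohash (the length=12 in Harley's field type above) pins a point down to roughly centimeter scale.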
[jira] [Commented] (LUCENE-3869) possible hang in UIMATypeAwareAnalyzerTest
[ https://issues.apache.org/jira/browse/LUCENE-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229939#comment-13229939 ] Tommaso Teofili commented on LUCENE-3869: - I tried to reproduce that many times (same command/seed) but with no luck so far. Which environment are you running, Robert? possible hang in UIMATypeAwareAnalyzerTest -- Key: LUCENE-3869 URL: https://issues.apache.org/jira/browse/LUCENE-3869 Project: Lucene - Java Issue Type: Bug Components: modules/analysis Affects Versions: 4.0 Reporter: Robert Muir Just testing an unrelated patch, I was hung (with 100% cpu) in UIMATypeAwareAnalyzerTest. I'll attach a stacktrace at the moment of the hang. The fact that we get a seed in the actual stacktraces for cases like this is awesome! Thanks Dawid! I don't think it reproduces 100%, but I'll try beasting this seed to see if I can reproduce the hang: should be 'ant test -Dtestcase=UIMATypeAwareAnalyzerTest -Dtests.seed=-262aada3325aa87a:-44863926cf5c87e9:5c8c471d901b98bd' from what I can see.
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229945#comment-13229945 ] Bill Bell commented on SOLR-2155: - David, What is an example URL call for a multiValued field? Does geofilt work?

/select?q=*:*&fq={!geofilt}&sort=geodist() asc&sfield=store_hash&d=10

Or do we need to use gh_geofilt, like this?

/select?q=*:*&fq={!gh_geofilt}&sort=geodist() asc&sfield=store_hash&d=10

Geospatial search using geohash prefixes Key: SOLR-2155 URL: https://issues.apache.org/jira/browse/SOLR-2155 Project: Solr Issue Type: Improvement Reporter: David Smiley
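The `&` separators in request URLs like the ones above are frequently eaten by mail archiving, so it can help to build the query string programmatically with explicit separators and percent-encoding. A hedged sketch (the host, port, point, and distance values are illustrative assumptions, not taken from Bill's setup):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Builds a Solr select URL with explicit '&' separators and percent-encoded
// values, so local-param syntax like {!geofilt} survives transport intact.
public class GeofiltUrlSketch {
    public static String buildUrl() throws Exception {
        String[][] params = {
            {"q", "*:*"},
            {"fq", "{!geofilt}"},       // spatial filter via local params
            {"sfield", "store_hash"},   // the location field
            {"pt", "45.15,-93.85"},     // center point (illustrative value)
            {"d", "10"},                // distance (illustrative value)
            {"sort", "geodist() asc"},
        };
        StringBuilder url = new StringBuilder("http://localhost:8983/solr/select");
        char sep = '?';
        for (String[] p : params) {
            url.append(sep).append(p[0]).append('=')
               .append(URLEncoder.encode(p[1], StandardCharsets.UTF_8.name()));
            sep = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildUrl());
    }
}
```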
[jira] [Reopened] (LUCENE-3871) Check what's up with stack traces being insane.
[ https://issues.apache.org/jira/browse/LUCENE-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss reopened LUCENE-3871: - Check what's up with stack traces being insane. --- Key: LUCENE-3871 URL: https://issues.apache.org/jira/browse/LUCENE-3871 Project: Lucene - Java Issue Type: Task Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 4.0
[jira] [Commented] (LUCENE-3871) Check what's up with stack traces being insane.
[ https://issues.apache.org/jira/browse/LUCENE-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229969#comment-13229969 ] Dawid Weiss commented on LUCENE-3871: - I dug a little deeper. The problem on ANT 1.7 is caused by broken stack filtering (the root cause is an assertion inside org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner rethrowing the original exception). I will add a workaround by disabling stack filtering; the full stack may be verbose, but it is better than a broken stack. Check what's up with stack traces being insane. --- Key: LUCENE-3871 URL: https://issues.apache.org/jira/browse/LUCENE-3871 Project: Lucene - Java Issue Type: Task Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 4.0
[jira] [Updated] (LUCENE-3871) Stack traces from failed tests are messed up on ANT 1.7.x
[ https://issues.apache.org/jira/browse/LUCENE-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated LUCENE-3871: Issue Type: Bug (was: Task) Summary: Stack traces from failed tests are messed up on ANT 1.7.x (was: Check what's up with stack traces being insane.) Stack traces from failed tests are messed up on ANT 1.7.x - Key: LUCENE-3871 URL: https://issues.apache.org/jira/browse/LUCENE-3871 Project: Lucene - Java Issue Type: Bug Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 3.6, 4.0
[jira] [Resolved] (LUCENE-3871) Stack traces from failed tests are messed up on ANT 1.7.x
[ https://issues.apache.org/jira/browse/LUCENE-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved LUCENE-3871. - Resolution: Fixed Fix Version/s: 3.6 Stack traces from failed tests are messed up on ANT 1.7.x - Key: LUCENE-3871 URL: https://issues.apache.org/jira/browse/LUCENE-3871 Project: Lucene - Java Issue Type: Bug Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 3.6, 4.0
[jira] [Commented] (LUCENE-3868) Thread interruptions shouldn't cause unhandled thread errors (or should they?).
[ https://issues.apache.org/jira/browse/LUCENE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229990#comment-13229990 ] Dawid Weiss commented on LUCENE-3868: - This is fairly messy in trunk; threads are interrupted either at method or at class level (depending on a sysprop). Additionally, the interruption is done like this:

{code}
t.setUncaughtExceptionHandler(null);
Thread.setDefaultUncaughtExceptionHandler(null);
if (!t.getName().startsWith("SyncThread")) // avoid zookeeper jre crash
  t.interrupt();
{code}

This doesn't restore the default handler, may cause interference with other threads (which do have handlers), etc. I'd rather fix it by switching to LUCENE-3808, where this is solved at the runner's level (and controlled via annotations). Thread interruptions shouldn't cause unhandled thread errors (or should they?). --- Key: LUCENE-3868 URL: https://issues.apache.org/jira/browse/LUCENE-3868 Project: Lucene - Java Issue Type: Bug Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 3.6, 4.0 This is a result of pulling uncaught exception catching to a rule above interrupt in internalTearDown(); check how it was before and restore previous behavior?
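One way to avoid the clobbering Dawid describes is to replace only the handler of the thread being torn down and leave the JVM-wide default handler untouched. A hedged sketch (the method name and structure are mine, not the test framework's; LUCENE-3808 solves this more cleanly at the runner level):

```java
// Interrupts a leftover test thread without touching the JVM-wide default
// uncaught-exception handler, so unrelated threads keep their error reporting.
public class InterruptSketch {
    public static void interruptQuietly(Thread t) {
        // Silence only this thread; do NOT null the default handler, which
        // the snippet above does without restoring it afterwards.
        t.setUncaughtExceptionHandler((thread, error) -> { /* expected during teardown */ });
        if (!t.getName().startsWith("SyncThread")) { // avoid zookeeper jre crash
            t.interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try { Thread.sleep(60_000); } catch (InterruptedException expected) { }
        });
        worker.start();
        interruptQuietly(worker);
        worker.join();
        System.out.println("worker stopped: " + !worker.isAlive());
    }
}
```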
[jira] [Resolved] (SOLR-3233) SolrExampleStreamingBinaryTest num results != expected exceptions (reproducible).
[ https://issues.apache.org/jira/browse/SOLR-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved SOLR-3233. --- Resolution: Cannot Reproduce Assignee: Dawid Weiss Cannot reproduce anymore with this seed; possibly fixed in between. SolrExampleStreamingBinaryTest num results != expected exceptions (reproducible). - Key: SOLR-3233 URL: https://issues.apache.org/jira/browse/SOLR-3233 Project: Solr Issue Type: Bug Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 4.0

{noformat}
git clone -b SOLR-3220 --depth 0 g...@github.com:dweiss/lucene_solr.git
git co 9b1efde7a4882caa9dd04556aa4b849db68081a5
cd solr
ant test-core -Dtests.filter=*.SolrExampleStreamingBinaryTest -Dtests.filter.method=testStatistics -Drt.seed=F57E2420CEBDC955 -Dargs=-Dfile.encoding=UTF-8
{noformat}

The number of returned committed docs is invalid; this is reproducible and occurs in many methods, not only in testStatistics.

{code}
int i = 0;
//                        0   1   2   3   4   5   6   7   8   9
int[] nums = new int[] { 23, 26, 38, 46, 55, 63, 77, 84, 92, 94 };
for (int num : nums) {
  SolrInputDocument doc = new SolrInputDocument();
  doc.setField("id", "doc" + i++);
  doc.setField("name", "doc: " + num);
  doc.setField("f", num);
  server.add(doc);
}
server.commit();
assertNumFound("*:*", nums.length); // FAILURE here. Indeed, a query via web browser shows not all docs are in?
{code}
[JENKINS] Solr-trunk - Build # 1794 - Still Failing
Build: https://builds.apache.org/job/Solr-trunk/1794/

1 tests failed.

FAILED: org.apache.solr.TestDistributedSearch.testDistribSearch

Error Message:
Uncaught exception by thread: Thread[Thread-656,5,]

Stack Trace:
org.apache.lucene.util.UncaughtExceptionsRule$UncaughtExceptionsInBackgroundThread: Uncaught exception by thread: Thread[Thread-656,5,]
	at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:60)
	at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:618)
	at org.junit.rules.RunRules.evaluate(RunRules.java:18)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
	at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
	at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
	at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:20)
	at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:51)
	at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:21)
	at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22)
	at org.junit.rules.RunRules.evaluate(RunRules.java:18)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
	at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:743)
Caused by: java.lang.RuntimeException: org.apache.solr.client.solrj.SolrServerException: http://localhost:20742/solr
	at org.apache.solr.TestDistributedSearch$1.run(TestDistributedSearch.java:374)
Caused by: org.apache.solr.client.solrj.SolrServerException: http://localhost:20742/solr
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:496)
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:251)
	at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90)
	at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:312)
	at org.apache.solr.TestDistributedSearch$1.run(TestDistributedSearch.java:369)
Caused by: org.apache.commons.httpclient.ConnectTimeoutException: The host did not accept the connection within timeout of 100 ms
	at org.apache.commons.httpclient.protocol.ReflectionSocketFactory.createSocket(ReflectionSocketFactory.java:155)
	at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:125)
	at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
	at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.open(MultiThreadedHttpConnectionManager.java:1361)
	at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
	at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
	at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
	at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:426)
	... 4 more
Caused by: java.net.SocketTimeoutException: connect timed out
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
	at java.net.Socket.connect(Socket.java:546)
	at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
	at
Re: [JENKINS] Solr-trunk - Build # 1794 - Still Failing
Connection time out again.

D.

On Thu, Mar 15, 2012 at 10:10 AM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Solr-trunk/1794/ 1 tests failed. FAILED: org.apache.solr.TestDistributedSearch.testDistribSearch
[jira] [Resolved] (LUCENE-3856) Create docvalues based grouped facet collector
[ https://issues.apache.org/jira/browse/LUCENE-3856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen resolved LUCENE-3856. --- Resolution: Fixed Committed to trunk. Create docvalues based grouped facet collector -- Key: LUCENE-3856 URL: https://issues.apache.org/jira/browse/LUCENE-3856 Project: Lucene - Java Issue Type: Improvement Components: modules/grouping Reporter: Martijn van Groningen Fix For: 4.0 Attachments: LUCENE-3856.patch, LUCENE-3856.patch, LUCENE-3856.patch Create docvalues based grouped facet collector. Currently only term based collectors have been implemented (LUCENE-3802).
[jira] [Updated] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-3867: -- Attachment: LUCENE-3867.patch Hi, here is a new patch using Unsafe to get the bitness (with the well-known fallback) and for compressedOops detection. Looks much cleaner. I also like that the addressSize is now detected natively and not from sysprops. The constants mentioned by Dawid are only available in Java 7, so I reflected the underlying methods from theUnsafe. I also changed the boolean JRE_USES_COMPRESSED_OOPS to an integer JRE_REFERENCE_SIZE that is used by RamUsageEstimator. We might do the same for all other native types... (this is just a start). Shai: Can you test with your JVMs and also enable/disable compressed oops/refs? RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect - Key: LUCENE-3867 URL: https://issues.apache.org/jira/browse/LUCENE-3867 Project: Lucene - Java Issue Type: Bug Components: core/index Reporter: Shai Erera Assignee: Shai Erera Priority: Trivial Fix For: 3.6, 4.0 Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like this: NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml

{quote}
A single-dimension array is a single object. As expected, the array has the usual object header. However, this object header is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ...
{quote}

While on it, I wrote a sizeOf(String) impl, and I wonder how people feel about including such helper methods in RUE as static, stateless methods? It's not perfect, there's some room for improvement I'm sure; here it is:

{code}
/**
 * Computes the approximate size of a String object. Note that if this object
 * is also referenced by another object, you should add
 * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
 * method.
 */
public static int sizeOf(String str) {
  return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
      + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
      + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
      + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
}
{code}

If people are not against it, I'd like to also add sizeOf(int[] / byte[] / long[] / double[] ... and String[]).
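The arithmetic in the quoted page can be made concrete. A hedged sketch of the shallow array-size formula, assuming the layout that page describes (12-byte array header, 8-byte object alignment, as on a 32-bit JVM or a 64-bit JVM with compressed oops); real JVMs vary, which is exactly why the patch probes Unsafe instead of hardcoding:

```java
// Shallow size of a char[] under the 12-byte-header / 8-byte-alignment model:
// header (object header + 4-byte length field) + 2 bytes per char, rounded up.
public class ArraySizeSketch {
    static final int ARRAY_HEADER = 12; // 8-byte object header + 4-byte length
    static final int ALIGNMENT = 8;     // object sizes are rounded to 8 bytes

    public static long shallowSizeOfCharArray(int length) {
        long raw = ARRAY_HEADER + 2L * length;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT; // align up
    }

    public static void main(String[] args) {
        System.out.println(shallowSizeOfCharArray(0));  // 16: header alone, aligned up
        System.out.println(shallowSizeOfCharArray(10)); // 32: 12 + 20, already aligned
    }
}
```

Note there is no per-element NUM_BYTES_OBJECT_REF term anywhere in this formula, which is Shai's point about the current NUM_BYTES_ARRAY_HEADER computation.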
[jira] [Commented] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230030#comment-13230030 ] Vadim Kisselmann commented on SOLR-3238: I use Solr 4.0 from trunk (latest) with tomcat6. I have been getting an error in the new Admin UI for a week now (I update every day from trunk): This interface requires that you activate the admin request handlers; add the following configuration to your solrconfig.xml:

<!-- Admin Handlers - This will register all the standard admin RequestHandlers. -->
<requestHandler name="/admin/" class="solr.admin.AdminHandlers" />

Admin request handlers are definitely activated in my solrconfig. Sequel of Admin UI -- Key: SOLR-3238 URL: https://issues.apache.org/jira/browse/SOLR-3238 Project: Solr Issue Type: Improvement Components: web gui Affects Versions: 4.0 Reporter: Stefan Matheis (steffkes) Assignee: Stefan Matheis (steffkes) Fix For: 4.0 Catch-All Issue for all upcoming Bugs/Reports/Suggestions on the Admin UI
[jira] [Commented] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230043#comment-13230043 ] Uwe Schindler commented on SOLR-3238: - I got this error with a trunk checkout using ant run-example, too. But only on the first run; later runs work. Sequel of Admin UI -- Key: SOLR-3238 URL: https://issues.apache.org/jira/browse/SOLR-3238 Project: Solr Issue Type: Improvement Components: web gui Affects Versions: 4.0 Reporter: Stefan Matheis (steffkes) Assignee: Stefan Matheis (steffkes) Fix For: 4.0 Catch-All Issue for all upcoming Bugs/Reports/Suggestions on the Admin UI
[jira] [Issue Comment Edited] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230043#comment-13230043 ] Uwe Schindler edited comment on SOLR-3238 at 3/15/12 10:43 AM: --- I got this error with a trunk checkout using ant run-example, too. But only on the first run; later runs work. EDIT: I think this has nothing to do with the admin UI. When this happened, I got some exceptions during the startup of Solr. Can you check for them in the logs? Unfortunately I cannot reproduce it at the moment.
[jira] [Issue Comment Edited] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230056#comment-13230056 ] Vadim Kisselmann edited comment on SOLR-3238 at 3/15/12 10:47 AM: -- It's weird :) ant run-example starts the server with Jetty, and it works. As the next step I build it one more time with ant example and start my Tomcat, and it works, too. When I update to a new Solr version from trunk and build it with ant example, I get this error again.
[jira] [Commented] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230056#comment-13230056 ] Vadim Kisselmann commented on SOLR-3238: It's weird :) ant run-example starts the server with Jetty, and it works. As the next step I build it one more time with ant example and start my Tomcat, and it works, too. When I update to a new Solr version from trunk and build it with ant example, I get this error again.
[jira] [Issue Comment Edited] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230056#comment-13230056 ] Vadim Kisselmann edited comment on SOLR-3238 at 3/15/12 11:02 AM: -- It's weird :) ant run-example starts the server with Jetty, and it works. As the next step I build it one more time with ant example and start my Tomcat, and it works, too. When I update to a new Solr version from trunk and build it with ant example, I get this error again. EDIT: no errors at this time in my log files.
[jira] [Commented] (LUCENE-3869) possible hang in UIMATypeAwareAnalyzerTest
[ https://issues.apache.org/jira/browse/LUCENE-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230069#comment-13230069 ] Robert Muir commented on LUCENE-3869: - Linux amd64. I can try to dig into this, and I'll try upgrading my JVM etc. too; it's a bit outdated :) possible hang in UIMATypeAwareAnalyzerTest -- Key: LUCENE-3869 URL: https://issues.apache.org/jira/browse/LUCENE-3869 Project: Lucene - Java Issue Type: Bug Components: modules/analysis Affects Versions: 4.0 Reporter: Robert Muir Just testing an unrelated patch, I was hung (with 100% CPU) in UIMATypeAwareAnalyzerTest. I'll attach a stacktrace at the moment of the hang. The fact we get a seed in the actual stacktraces for cases like this is awesome! Thanks Dawid! I don't think it reproduces 100%, but I'll try beasting this seed to see if I can reproduce the hang: should be 'ant test -Dtestcase=UIMATypeAwareAnalyzerTest -Dtests.seed=-262aada3325aa87a:-44863926cf5c87e9:5c8c471d901b98bd' from what I can see.
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12754 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12754/ All tests passed Build Log (for compile errors): [...truncated 15069 lines...]
Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 12754 - Failure
I committed a fix for this. Martijn On 15 March 2012 12:11, Apache Jenkins Server jenk...@builds.apache.orgwrote: Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12754/ All tests passed Build Log (for compile errors): [...truncated 15069 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Met vriendelijke groet, Martijn van Groningen
[jira] [Commented] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230104#comment-13230104 ] Vadim Kisselmann commented on SOLR-3238: Now I have error messages:
SCHWERWIEGEND: The web application [/solr2] appears to have started a thread named [main-SendThread(zookeeper:2181)] but has failed to stop it. This is very likely to create a memory leak.
Exception in thread Thread-2 java.lang.NullPointerException
    at org.apache.solr.cloud.Overseer$CloudStateUpdater.amILeader(Overseer.java:179)
    at org.apache.solr.cloud.Overseer$CloudStateUpdater.run(Overseer.java:104)
    at java.lang.Thread.run(Thread.java:662)
15.03.2012 13:25:17 org.apache.catalina.loader.WebappClassLoader loadClass
INFO: Illegal access: this web application instance has been stopped already. Could not load org.apache.zookeeper.server.ZooTrace. The eventual following stack trace is caused by an error thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access, and has no functional impact.
java.lang.IllegalStateException
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1531)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1491)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1196)
15.03.2012 13:25:17 org.apache.coyote.http11.Http11Protocol destroy
SLF4J: The requested version 1.5.8 by your slf4j binding is not compatible with [1.6]
SLF4J: See http://www.slf4j.org/codes.html#version_mismatch for further details.
log4j:WARN No appenders could be found for logger (org.apache.solr.core.SolrResourceLoader).
log4j:WARN Please initialize the log4j system properly.
Steps: I deleted the one default core in solr.xml, because I wanted to create new cores with the CoreAdminHandler. I started Tomcat.
[jira] [Issue Comment Edited] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230104#comment-13230104 ] Vadim Kisselmann edited comment on SOLR-3238 at 3/15/12 12:44 PM: -- Now it's completely broken. Rebuild and restart, whether Jetty or Tomcat, changes nothing.
[jira] [Issue Comment Edited] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230104#comment-13230104 ] Vadim Kisselmann edited comment on SOLR-3238 at 3/15/12 12:53 PM: -- EDIT: I get the same problem on another server (Tomcat, sharded, without ZK). With an old revision from Feb. it works; with a new checkout from trunk it does not.
AW: AW: AW: AW: Setting Stopword Set in PyLucene (or using Set in general)
Hi, I have to add a comment to my previous mail: I'd have preferred using this option (#2) in toArray (for both JavaList and JavaSet) as it does not require the wrapping into Java Integer (etc.) objects. However this method does not work with lucene.ArrayList:

>>> x = lucene.JArray('int')([1, 2])
>>> x
JArray<int>[1, 2]
>>> y = lucene.ArrayList(x)
Traceback: lucene.InvalidArgsError: (<type 'ArrayList'>, '__init__', (JArray<int>[1, 2],))

Sorry - that's rubbish of course: ArrayList requires a collection in its constructor and JArray isn't a collection. So this can't work! The 'challenge' was to be able to use JavaSet and/or JavaList (both are collections) as an argument for ArrayList. (During init of ArrayList the toArray() method is called, however.) So I gave it a quick try again, and tried the 2nd alternative: 1) return JArray(object)([lucene.Integer()-object*]) or 2) return JArray(int)([Python-int-literal*]) but that option then fails (in the demo code) when using bool (or float) types. Attached is a revised version of collections.py with the alternative code (disabled) - if anyone's interested... The mentioned issue with the created JArray containing the same objects still remains. I'll have to look deeper into that, but as said I'm out of office next week ... BTW, sorry if this is out of scope of the PyLucene mailing list (it's more a JCC-related discussion) - we can continue with 'private' mail if that's preferred. Regards, Thomas

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from lucene import JArray, Boolean, Float, Integer, Long, String, \
     PythonSet, PythonList, PythonIterator, PythonListIterator, JavaError, \
     NoSuchElementException, IllegalStateException, IndexOutOfBoundsException

# 1 wrap via lucene.Integer etc. (string not needed) - works (almost...)
_types = {
    bool: Boolean,
    float: Float,
    int: Integer,
    long: Long,
}

# TODO: Remove this code ...
# 2 used typed JArray: works for string, but not float and bool
X_types = {
    int: 'int',
    bool: 'bool',
    float: 'float',
    long: 'long',
    str: 'string'
}


class JavaSet(PythonSet):
    """This class implements java.util.Set around a Python set instance it wraps."""

    def __init__(self, _set):
        super(JavaSet, self).__init__()
        self._set = _set

    def __contains__(self, obj):
        return obj in self._set

    def __len__(self):
        return len(self._set)

    def __iter__(self):
        return iter(self._set)

    def add(self, obj):
        if obj not in self._set:
            self._set.add(obj)
            return True
        return False

    def addAll(self, collection):
        size = len(self._set)
        self._set.update(collection)
        return len(self._set) > size

    def clear(self):
        self._set.clear()

    def contains(self, obj):
        return obj in self._set

    def containsAll(self, collection):
        for obj in collection:
            if obj not in self._set:
                return False
        return True

    def equals(self, collection):
        if type(self) is type(collection):
            return self._set == collection._set
        return False

    def isEmpty(self):
        return len(self._set) == 0

    def iterator(self):
        class _iterator(PythonIterator):
            def __init__(_self):
                super(_iterator, _self).__init__()
                _self._iterator = iter(self._set)
            def hasNext(_self):
                if hasattr(_self, '_next'):
                    return True
                try:
                    _self._next = _self._iterator.next()
                    return True
                except StopIteration:
                    return False
            def next(_self):
                if hasattr(_self, '_next'):
                    next = _self._next
                    del _self._next
                else:
                    next = _self._iterator.next()
                return next
        return _iterator()

    def remove(self, obj):
        try:
            self._set.remove(obj)
            return True
        except KeyError:
            return False

    def removeAll(self, collection):
        result = False
        for obj in collection:
            try:
                self._set.remove(obj)
                result = True
            except KeyError:
                pass
        return result

    def retainAll(self, collection):
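The java.util.Set semantics of the wrapper above (add/addAll/removeAll reporting whether the set changed) can be checked without a PyLucene runtime. The sketch below is a pure-Python stand-in: the name PlainJavaSet is made up here, and it deliberately drops the PythonSet base class so it runs on plain CPython.

```python
# Pure-Python stand-in for the JavaSet wrapper, so the java.util.Set
# change-reporting semantics can be exercised without the lucene module.
# PlainJavaSet is a hypothetical name for this sketch only.
class PlainJavaSet:
    def __init__(self, _set):
        self._set = _set

    def add(self, obj):
        # java.util.Set.add returns True only if the element was absent
        if obj not in self._set:
            self._set.add(obj)
            return True
        return False

    def addAll(self, collection):
        # True if the set changed as a result of the call
        size = len(self._set)
        self._set.update(collection)
        return len(self._set) > size

    def removeAll(self, collection):
        # True if at least one element was actually removed
        result = False
        for obj in collection:
            try:
                self._set.remove(obj)
                result = True
            except KeyError:
                pass
        return result


s = PlainJavaSet({1, 2})
print(s.add(3))          # 3 was absent, so True
print(s.add(3))          # already present, so False
print(s.addAll([3, 4]))  # 4 is new, so True
print(s.removeAll([9]))  # nothing removed, so False
```

This mirrors the contract the JCC wrapper has to satisfy when Java code calls back into the Python set.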
[jira] [Updated] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-3867: --- Attachment: LUCENE-3867.patch Thanks Uwe ! I ran the test, and now with both J9 (IBM) and Oracle, I get this print (without enabling any flag): {code} [junit] NOTE: running test testReferenceSize [junit] NOTE: This JVM is 64bit: true [junit] NOTE: Reference size in this JVM: 8 {code} * I modified the test name to testReferenceSize (was testCompressedOops). I wrote this small test to print the differences between sizeOf(String) and estimateRamUsage(String): {code} public void testSizeOfString() throws Exception { String s = abcdefgkjdfkdsjdskljfdskfjdsf; String sub = s.substring(0, 4); System.out.println(original= + RamUsageEstimator.sizeOf(s)); System.out.println(sub= + RamUsageEstimator.sizeOf(sub)); System.out.println(checkInterned=true(orig): + new RamUsageEstimator().estimateRamUsage(s)); System.out.println(checkInterned=false(orig): + new RamUsageEstimator(false).estimateRamUsage(s)); System.out.println(checkInterned=false(sub): + new RamUsageEstimator(false).estimateRamUsage(sub)); } {code} It prints: {code} original=104 sub=56 checkInterned=true(orig): 0 checkInterned=false(orig): 98 checkInterned=false(sub): 98 {code} So clearly estimateRamUsage factors in the sub-string's larger char[]. The difference in sizes of 'orig' stem from AverageGuessMemoryModel which computes the reference size to be 4 (hardcoded), and array size to be 16 (hardcoded). I modified AverageGuess to use constants from RUE (they are best guesses themselves). Still the test prints a difference, but now I think it's because sizeOf(String) aligns the size to mod 8, while estimateRamUsage isn't. I fixed that in size(Object), and now the prints are the same. * I also fixed sizeOfArray -- if the array.length == 0, it returned 0, but it should return its header, and aligned to mod 8 as well. 
* I modified sizeOf(String[]) to sizeOf(Object[]) and compute its raw size only. I started to add sizeOf(String), fastSizeOf(String) and deepSizeOf(String[]), but reverted to avoid the hassle -- the documentation confuses even me :). * Changed all sizeOf() to return long, and align() to take and return long. I think this is ready to commit, though I'd appreciate a second look on the MemoryModel and size(Obj) changes. Also, how about renaming MemoryModel methods to: arrayHeaderSize(), classHeaderSize(), objReferenceSize() to make them more clear and accurate? For instance, getArraySize does not return the size of an array, but its object header ... RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect - Key: LUCENE-3867 URL: https://issues.apache.org/jira/browse/LUCENE-3867 Project: Lucene - Java Issue Type: Bug Components: core/index Reporter: Shai Erera Assignee: Shai Erera Priority: Trivial Fix For: 3.6, 4.0 Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like that: NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml {quote} A single-dimension array is a single object. As expected, the array has the usual object header. However, this object head is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ... {quote} While on it, I wrote a sizeOf(String) impl, and I wonder how do people feel about including such helper methods in RUE, as static, stateless, methods? 
It's not perfect, there's some room for improvement I'm sure, here it is: {code} /** * Computes the approximate size of a String object. Note that if this object * is also referenced by another object, you should add * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this * method. */ public static int sizeOf(String str) { return 2 * str.length() + 6 // chars + additional safeness for arrays alignment + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String
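The arithmetic in the sizeOf(String) sketch above can be checked in isolation. The constant values below are assumptions for one particular 64-bit JVM layout (4-byte int, 8-byte object header, 16-byte array header), not values taken from RamUsageEstimator, which picks them per JVM:

```python
# Assumed JVM layout constants for this sketch only (real values depend on
# the JVM, 32/64-bit mode, and compressed oops).
NUM_BYTES_INT = 4
NUM_BYTES_OBJECT_HEADER = 8
NUM_BYTES_ARRAY_HEADER = 16

def align(size):
    # Round up to a multiple of 8, as the JVM aligns object sizes.
    return (size + 7) & ~7

def size_of_string(s):
    # Mirrors the formula in the comment above: char data (2 bytes per char
    # plus safety padding for array alignment), String's 3 int fields, the
    # char[] array header, and the String object header itself.
    return (2 * len(s) + 6
            + 3 * NUM_BYTES_INT
            + NUM_BYTES_ARRAY_HEADER
            + NUM_BYTES_OBJECT_HEADER)

print(size_of_string("abcd"))         # 8 + 6 + 12 + 16 + 8 = 50
print(align(size_of_string("abcd")))  # aligned up to 56
```

The align step is exactly the mod-8 rounding the patch adds to sizeOf(String), and its absence is one reason estimateRamUsage and sizeOf disagreed in the prints above.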
[jira] [Issue Comment Edited] (SOLR-3238) Sequel of Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230104#comment-13230104 ] Vadim Kisselmann edited comment on SOLR-3238 at 3/15/12 1:14 PM: - EDIT: I get the same problem on another server (Tomcat, sharded, without ZK). With an old revision from Feb. it works; with a new checkout from trunk it does not. EDIT2: It works when I remove the example folder, check out a new version from trunk and rebuild it. I think it's a problem with solr.xml. On server restart it breaks. With older revisions like r1292064 from Feb. it works. I think you're right, this has nothing to do with the admin UI. Sorry for the spam here. New issue?
Test weirdness
I've seen a couple of notes this morning about running/test weirdness. I was getting an error in TestSystemPropertiesInvariantsRule that made no sense whatsoever, with no changes to the code, even after the usual tricks (updating a couple of times, running 'ant clean', all that jazz). The error was about new-property-1 new-value-1 existing, which makes some sense; that's the test. But what made no sense is that it was suddenly failing. I deleted the entire tree, did a fresh checkout from the repo, and the problems vanished. Bad magic, but it worked. Mac OS X, FWIW. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230171#comment-13230171 ] Dawid Weiss commented on LUCENE-3867:
-
-1 to mixing shallow and deep sizeofs -- sizeOf(Object[] arr) is shallow and just feels wrong to me. All the other methods yield the deep total, so why make an exception? If anything, make it explicit and then do it for any type of object:
{code}
shallowSizeOf(Object t);
sizeOf(Object t);
{code}
I'm not complaining just because my sense of taste is offended. I actually use this class in my own projects, and I would hate to have to check the JavaDoc every time to be sure what a given method does (especially with multiple overloads). In other words, I would hate to see this:
{code}
Object[] o1 = new Object[] {1, 2, 3};
Object o2 = o1;
if (sizeOf(o1) != sizeOf(o2)) throw new WtfException();
{code}

RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
-
Key: LUCENE-3867 URL: https://issues.apache.org/jira/browse/LUCENE-3867 Project: Lucene - Java Issue Type: Bug Components: core/index Reporter: Shai Erera Assignee: Shai Erera Priority: Trivial Fix For: 3.6, 4.0 Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch

RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed as NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml
{quote}
A single-dimension array is a single object. As expected, the array has the usual object header. However, this object header is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ...
{quote}
While I was at it, I wrote a sizeOf(String) impl, and I wonder how people feel about including such helper methods in RUE as static, stateless methods? It's not perfect, and there's some room for improvement I'm sure, but here it is:
{code}
/**
 * Computes the approximate size of a String object. Note that if this object
 * is also referenced by another object, you should add
 * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this method.
 */
public static int sizeOf(String str) {
  return 2 * str.length() + 6 // chars, plus slack for array alignment
      + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
      + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
      + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
}
{code}
If people are not against it, I'd like to also add sizeOf(int[] / byte[] / long[] / double[] ... and String[]).
--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
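The quoted array-layout rule reduces to a small piece of arithmetic, sketched below. This is an illustration only: the 12-byte array header and 8-byte alignment are the figures from the quote for a typical 64-bit JVM with compressed oops, not RamUsageEstimator's actual constants, and the class and method names are invented.

```java
public class ArrayMemory {
    // Assumed layout per the quote above: 12-byte array header
    // (object header plus a 4-byte length field), then the element
    // data, padded up to the JVM's 8-byte object alignment.
    static final int ARRAY_HEADER = 12;
    static final int ALIGNMENT = 8;

    static long sizeOfArray(int length, int bytesPerElement) {
        long raw = ARRAY_HEADER + (long) length * bytesPerElement;
        // round up to the next multiple of the object alignment
        return ((raw + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT;
    }

    public static void main(String[] args) {
        System.out.println(sizeOfArray(10, 4)); // int[10]: 12 + 40 = 52, padded to 56
        System.out.println(sizeOfArray(3, 2));  // char[3]: 12 + 6 = 18, padded to 24
    }
}
```

Note that no per-element NUM_BYTES_OBJECT_REF term appears in the header, which is exactly the point of the bug report.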
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230181#comment-13230181 ] Uwe Schindler commented on LUCENE-3867:
---
{quote}
I ran the test, and now with both J9 (IBM) and Oracle, I get this print (without enabling any flag):
{code}
[junit] NOTE: running test testReferenceSize
[junit] NOTE: This JVM is 64bit: true
[junit] NOTE: Reference size in this JVM: 8
{code}
{quote}
I hope with compressedOops explicitly enabled (or however they call them), you get a reference size of 4 in J9 and pre-1.6.0_23 Oracle?
[jira] [Updated] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-3867:
---
Attachment: LUCENE-3867.patch

Ok, removed sizeOf(Object[]). One can compute it by using RUE.estimateRamSize to do a deep calculation. Geez Dawid, you took away all the reasons I originally opened the issue for ;). But at least AvgGuessMemoryModel and RUE.size() are more accurate now. And we have some useful utility methods.
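The shallow-versus-deep distinction being settled here can be sketched for the array case. Everything below is an invented illustration: the constants assume a 64-bit JVM with compressed oops, the names are not from any patch, and the deep walk handles only arrays, counting each object once even when it is referenced twice.

```java
import java.lang.reflect.Array;
import java.util.IdentityHashMap;

public class SizeOfSketch {
    // Illustrative layout constants, not RamUsageEstimator's.
    static final int ARRAY_HEADER = 16, REF = 4, ALIGN = 8;

    static long align(long n) { return ((n + ALIGN - 1) / ALIGN) * ALIGN; }

    static int bytesPerElement(Class<?> component) {
        if (!component.isPrimitive()) return REF;
        if (component == long.class || component == double.class) return 8;
        if (component == int.class || component == float.class) return 4;
        if (component == short.class || component == char.class) return 2;
        return 1; // byte, boolean
    }

    // Shallow: the array object itself, ignoring what its elements point to.
    static long shallowSizeOf(Object array) {
        Class<?> c = array.getClass().getComponentType();
        return align(ARRAY_HEADER + (long) Array.getLength(array) * bytesPerElement(c));
    }

    // Deep: shallow size plus, for reference arrays, the referenced
    // arrays, each counted once (the seen map breaks shared references).
    static long sizeOf(Object array, IdentityHashMap<Object, Object> seen) {
        if (array == null || seen.put(array, array) != null) return 0;
        long total = shallowSizeOf(array);
        if (!array.getClass().getComponentType().isPrimitive())
            for (int i = 0; i < Array.getLength(array); i++)
                total += sizeOf(Array.get(array, i), seen);
        return total;
    }

    public static void main(String[] args) {
        int[] a = new int[10];                   // shallow: 16 + 40 = 56
        Object[] arr = new Object[] { a, a };    // shallow: 16 + 2*4 = 24
        System.out.println(shallowSizeOf(arr));                    // 24
        System.out.println(sizeOf(arr, new IdentityHashMap<>()));  // 24 + 56 = 80
        // The static type of the argument no longer matters, which is
        // the point of the explicit names: no overload resolution can
        // silently switch shallow and deep semantics.
        Object o2 = arr;
        System.out.println(sizeOf((Object[]) o2, new IdentityHashMap<>())); // 80
    }
}
```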
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230187#comment-13230187 ] Shai Erera commented on LUCENE-3867:

I ran
ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-XX:+UseCompressedOops
and
ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-XX:-UseCompressedOops
and get 4 (with CompressedOops) and 8 (without).
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230202#comment-13230202 ] Mark Miller commented on LUCENE-3867:
-
Oh, bummer - looks like we lost the whole history of this class...such a bummer. I really wanted to take a look at how this class had evolved since I last looked at it. I've missed the conversations around the history loss - is that gone, gone, gone, or is there still some way to find it?
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230206#comment-13230206 ] Mark Miller commented on LUCENE-3867:
-
Scratch that - I was trying to look back from the Apache git clone using git - I assumed its history matched svn - but I get a clean full history using svn.
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230208#comment-13230208 ] Uwe Schindler commented on LUCENE-3867:
---
Die, GIT, die! :-) (as usual)
[jira] [Created] (SOLR-3250) Dynamic Field capabilities based on value not name
Dynamic Field capabilities based on value not name
--
Key: SOLR-3250 URL: https://issues.apache.org/jira/browse/SOLR-3250 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll

In some situations one already knows the schema of one's content, so having to declare a schema in Solr becomes cumbersome. For instance, if you have all your content in JSON (or can easily generate it) or other typed serializations, then you already have a schema defined. It would be nice if we could have support for dynamic fields that used whatever name was passed in, but picked the appropriate FieldType for that field based on the value of the content. So, for instance, if the input is a number, it would select the appropriate numeric type. If it is a plain text string, it would pick the appropriate text field (you could even add in language detection here). If it is comma separated, it would treat the values as keywords, etc. We could also likely send in a hint as to the type. With this approach you of course have a first-in-wins situation, but assuming you have this schema defined elsewhere, that is likely fine. Supporting such cases would allow us to be schemaless when appropriate, while offering the benefits of schemas when appropriate. Naturally, one could mix and match these too.
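The "pick a FieldType from the value" idea above can be sketched as a toy dispatcher. The returned names echo common Solr type names, but the mapping, class, and method are invented for illustration and are not part of any patch.

```java
public class TypeGuesser {
    // Toy value-based type guesser: inspect the runtime type (and, for
    // strings, the shape of the content) and return a field type name.
    static String guessFieldType(Object value) {
        if (value instanceof Integer || value instanceof Long) return "tlong";
        if (value instanceof Float || value instanceof Double) return "tdouble";
        if (value instanceof Boolean) return "boolean";
        if (value instanceof String) {
            String s = (String) value;
            if (s.contains(",")) return "string";              // comma separated: treat as keywords
            if (s.trim().contains(" ")) return "text_general"; // free text: analyzed field
            return "string";                                   // single token
        }
        return "string"; // fallback; a caller-supplied hint could override this
    }

    public static void main(String[] args) {
        System.out.println(guessFieldType(42));                  // tlong
        System.out.println(guessFieldType(3.14));                // tdouble
        System.out.println(guessFieldType("plain text string")); // text_general
        System.out.println(guessFieldType("red,green,blue"));    // string
    }
}
```

The first-in-wins caveat from the description applies here too: once a name has been bound to a guessed type, later values of a different type would have to coerce or fail.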
[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 1983 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/1983/ 1 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest Error Message: ERROR: SolrIndexSearcher opens=91 closes=90 Stack Trace: junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=91 closes=90 at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner$3.addError(JUnitTestRunner.java:974) at junit.framework.TestResult.addError(TestResult.java:38) at junit.framework.JUnit4TestAdapterCache$1.testFailure(JUnit4TestAdapterCache.java:51) at org.junit.runner.notification.RunNotifier$4.notifyListener(RunNotifier.java:100) at org.junit.runner.notification.RunNotifier$SafeNotifier.run(RunNotifier.java:41) at org.junit.runner.notification.RunNotifier.fireTestFailure(RunNotifier.java:97) at org.junit.internal.runners.model.EachTestNotifier.addFailure(EachTestNotifier.java:26) at org.junit.runners.ParentRunner.run(ParentRunner.java:306) at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:743) Caused by: java.lang.AssertionError: ERROR: SolrIndexSearcher opens=91 closes=90 at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:211) at org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:100) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:36) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:20) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:51) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:21) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) ... 4 more Build Log (for compile errors): [...truncated 11141 lines...]
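The "SolrIndexSearcher opens=91 closes=90" assertion in that failure comes from open/close bookkeeping of this general shape; the sketch below is an invented illustration of the pattern, not SolrTestCaseJ4's actual code.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SearcherTracker {
    // Count every searcher open and close; a leaked searcher shows up
    // at the end of the suite as opens > closes.
    private final AtomicInteger opens = new AtomicInteger();
    private final AtomicInteger closes = new AtomicInteger();

    void onOpen()  { opens.incrementAndGet(); }
    void onClose() { closes.incrementAndGet(); }

    // Would be called from an @AfterClass-style hook.
    void assertBalanced() {
        int o = opens.get(), c = closes.get();
        if (o != c) {
            throw new AssertionError("ERROR: SolrIndexSearcher opens=" + o + " closes=" + c);
        }
    }

    public static void main(String[] args) {
        SearcherTracker t = new SearcherTracker();
        t.onOpen(); t.onClose();
        t.onOpen(); // leaked: never closed
        try {
            t.assertBalanced();
            System.out.println("balanced");
        } catch (AssertionError e) {
            System.out.println(e.getMessage()); // opens=2 closes=1
        }
    }
}
```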
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230214#comment-13230214 ] Uwe Schindler commented on LUCENE-3867:
---
bq. I ran ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-XX:+UseCompressedOops and ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-XX:-UseCompressedOops and get 8 and 4 (with CompressedOops).

OK, thanks. So it seems to work at least with Oracle/Sun and IBM J9. I have no other updates to this detection code.
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230219#comment-13230219 ] Dawid Weiss commented on LUCENE-3867:

bq. Geez Dawid, you took away all the reasons I originally opened the issue for

This is by no means wasted time. I think the improvements are clear?

bq. Die, GIT, die!

I disagree here -- git is a great tool, even if the learning curve may be steep at first. git-svn is a whole different story (it's a great hack, but just a hack).

RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
Key: LUCENE-3867
URL: https://issues.apache.org/jira/browse/LUCENE-3867
Project: Lucene - Java
Issue Type: Bug
Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Trivial
Fix For: 3.6, 4.0
Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch

RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed as NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml

{quote} A single-dimension array is a single object. As expected, the array has the usual object header. However, this object head is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ... {quote}

While at it, I wrote a sizeOf(String) impl, and I wonder how people feel about including such helper methods in RUE, as static, stateless methods. It's not perfect -- there's some room for improvement, I'm sure -- but here it is:

{code}
/**
 * Computes the approximate size of a String object. Note that if this object
 * is also referenced by another object, you should add
 * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this
 * method.
 */
public static int sizeOf(String str) {
  return 2 * str.length() + 6 // chars + additional safeness for arrays alignment
      + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers
      + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array
      + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
}
{code}

If people are not against it, I'd like to also add sizeOf(int[] / byte[] / long[] / double[] ... and String[]).

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
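The sizeOf(int[] / byte[] / ...) variants proposed at the end could follow the same recipe as sizeOf(String). A minimal sketch, with the constants hardcoded locally (typical HotSpot values, an assumption here) and 8-byte object alignment applied:

```java
// Illustrative sketch only: constants mirror those discussed in this issue,
// hardcoded for a typical 32-bit / compressed-oops HotSpot layout.
class ArraySizeSketch {
    static final int NUM_BYTES_OBJECT_HEADER = 8;
    static final int NUM_BYTES_INT = 4;
    // Per the quoted javamex page: object header plus a 4-byte length field.
    static final int NUM_BYTES_ARRAY_HEADER = NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT;

    /** Approximate heap size of an int[], rounded up to 8-byte alignment. */
    static long sizeOf(int[] arr) {
        long size = NUM_BYTES_ARRAY_HEADER + (long) arr.length * NUM_BYTES_INT;
        return (size + 7) & ~7L; // JVMs align objects to 8 bytes
    }

    /** Approximate heap size of a byte[], rounded up to 8-byte alignment. */
    static long sizeOf(byte[] arr) {
        long size = NUM_BYTES_ARRAY_HEADER + arr.length;
        return (size + 7) & ~7L;
    }
}
```

The real constants vary by JVM and bitness, which is exactly what the rest of this thread is about.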
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230222#comment-13230222 ] Uwe Schindler commented on LUCENE-3867:

bq. I disagree here

Calm down, was just my well-known standard answer :-)
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230229#comment-13230229 ] Shai Erera commented on LUCENE-3867:

bq. This is by no means wasted time. I think the improvements are clear?

Yes, yes. It was a joke. OK, so can I proceed with the commit, or does someone intend to review the patch later?
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230228#comment-13230228 ] Dawid Weiss commented on LUCENE-3867:

Oh, I am calm, I just know people do hate git (and I used to as well, until I started using it frequently). Robert has a strong opinion about git, for example. Besides, there's nothing wrong with having a strong opinion -- it's great that people can choose what they like and still collaborate via patches (and this seems to be the common ground between all VCSs).
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230230#comment-13230230 ] Uwe Schindler commented on LUCENE-3867:

With Unsafe we can also get all the information, like the array header size, that we currently have hardcoded. Shouldn't we try to get these the same way I did for bitness and reference size -- using Unsafe.theUnsafe.arrayBaseOffset() -- and fall back to our hardcoded defaults?
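The probe-with-fallback idea could be sketched roughly as below. The reflection path mirrors the Unsafe.theUnsafe field and arrayBaseOffset() method mentioned above; the fallback value is an assumption for illustration:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Sketch: read the array header size from sun.misc.Unsafe when available,
// falling back to a hardcoded default (the value 12 here is an assumption).
class ArrayHeaderProbe {
    static final int FALLBACK_ARRAY_HEADER = 12;

    static int arrayHeaderSize() {
        try {
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Field f = unsafeClass.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            Object unsafe = f.get(null);
            Method m = unsafeClass.getMethod("arrayBaseOffset", Class.class);
            // arrayBaseOffset(byte[].class) is the number of bytes before
            // element 0, i.e. exactly the array header size on this JVM.
            return ((Number) m.invoke(unsafe, byte[].class)).intValue();
        } catch (Throwable t) {
            return FALLBACK_ARRAY_HEADER; // Unsafe not accessible on this JVM
        }
    }
}
```

Either branch returns a usable value, which is the point of the fallback design: callers never need to know whether the probe succeeded.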
[jira] [Commented] (SOLR-3250) Dynamic Field capabilities based on value not name
[ https://issues.apache.org/jira/browse/SOLR-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230231#comment-13230231 ] Grant Ingersoll commented on SOLR-3250:

Note, a core reload is not something I would want to do.

Dynamic Field capabilities based on value not name
Key: SOLR-3250
URL: https://issues.apache.org/jira/browse/SOLR-3250
Project: Solr
Issue Type: Improvement
Reporter: Grant Ingersoll

In some situations one already knows the schema of their content, so having to declare a schema in Solr becomes cumbersome. For instance, if you have all your content in JSON (or can easily generate it) or other typed serializations, then you already have a schema defined. It would be nice if we could have support for dynamic fields that used whatever name was passed in, but then picked the appropriate FieldType for that field based on the value of the content. So, for instance, if the input is a number, it would select the appropriate numeric type. If it is a plain text string, it would pick the appropriate text field (you could even add in language detection here). If it is comma separated, it would treat the values as keywords, etc. We could also likely send in a hint as to the type. With this approach you of course have a first-in-wins situation, but assuming you have this schema defined elsewhere, that is likely fine. Supporting such cases would allow us to be schemaless when appropriate, while offering the benefits of schemas when appropriate. Naturally, one could mix and match these too.
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230235#comment-13230235 ] Dawid Weiss commented on LUCENE-3867:

bq. using Unsafe.theUnsafe.arrayBaseOffset()? And fallback to our hardcoded defaults?

+1. I will also try on OpenJDK with various JITs, but I'll do it in the evening.

bq. Yes, yes. It was a joke.

Joke or no joke, the truth is I did complain a lot. :)
[jira] [Commented] (SOLR-3250) Dynamic Field capabilities based on value not name
[ https://issues.apache.org/jira/browse/SOLR-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230243#comment-13230243 ] Yonik Seeley commented on SOLR-3250:

Of course, hopefully everyone knows schemaless is mostly marketing b.s. -- when people do this, there is still a schema, but it's guessed on first use (and hence generally a horrible idea for production systems). It would be easy enough on a single node... but how does one handle a cluster? Say you index price=0 on nodeA, and price=100.0 on nodeB? A quick thought on how it might work:
- have a separate file, auto_fields.json, that keeps track of the mappings; it would be the same for all cores using that schema
- when we run across a field we haven't seen before, we must guess a type for it, then grab a lock
- update auto_fields.json
- we can update our in-memory schema with any new fields we find in auto_fields.json
- it works the same in ZK mode... it's just that auto_fields.json is in ZK, and we would use something like optimistic locking to update it
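The "guess a type on first use" step, and the nodeA/nodeB conflict above, can be made concrete with a tiny guesser. The type names and rules here are invented for illustration, not Solr's:

```java
// Hypothetical first-use type guesser for schemaless indexing.
// Rules are illustrative: integer -> "long", decimal -> "double",
// comma separated -> "keywords", anything else -> "text".
class TypeGuesser {
    static String guessType(String value) {
        try {
            Long.parseLong(value);
            return "long";
        } catch (NumberFormatException ignored) { }
        try {
            Double.parseDouble(value);
            return "double";
        } catch (NumberFormatException ignored) { }
        if (value.contains(",")) return "keywords";
        return "text";
    }
}
```

Note that guessType("0") yields "long" while guessType("100.0") yields "double", which is exactly the cross-node conflict described above: whichever node sees the field first wins, so the cluster needs the shared auto_fields.json (or its ZK equivalent) to agree.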
[jira] [Created] (LUCENE-3872) Index changes are lost if you call prepareCommit() then close()
Index changes are lost if you call prepareCommit() then close()
Key: LUCENE-3872
URL: https://issues.apache.org/jira/browse/LUCENE-3872
Project: Lucene - Java
Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 3.6, 4.0

You are supposed to call commit() after calling prepareCommit(), but... if you forget, and call close() after prepareCommit() without calling commit(), then any changes done after the prepareCommit() are silently lost (including adding/deleting docs, but also any completed merges). Spinoff from the java-user thread "lots of .cfs (compound files) in the index directory" from Tim Bogaert. I think to fix this, IW.close should throw an IllegalStateException if prepareCommit() was called with no matching call to commit().
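The proposed fix is essentially a small state machine. A toy model of just that check, in isolation (this is not IndexWriter; names are illustrative):

```java
// Toy model of the proposed fix: close() fails if prepareCommit() was
// called with no matching commit(), instead of silently dropping changes.
class TwoPhaseCloser {
    private boolean pendingCommit = false;

    void prepareCommit() { pendingCommit = true; }   // phase 1: stage changes
    void commit()        { pendingCommit = false; }  // phase 2: finish commit

    void close() {
        if (pendingCommit) {
            throw new IllegalStateException(
                "cannot close: prepareCommit() was called with no matching call to commit()");
        }
    }
}
```

With this shape, the forgetting-to-commit bug becomes a loud exception at close() rather than silent data loss.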
[jira] [Updated] (LUCENE-3872) Index changes are lost if you call prepareCommit() then close()
[ https://issues.apache.org/jira/browse/LUCENE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3872:

Attachment: LUCENE-3872.patch

Patch w/ failing test showing how we silently lose indexed docs...
[jira] [Updated] (LUCENE-3778) Create a grouping convenience class
[ https://issues.apache.org/jira/browse/LUCENE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen updated LUCENE-3778:

Fix Version/s: 4.0

Create a grouping convenience class
Key: LUCENE-3778
URL: https://issues.apache.org/jira/browse/LUCENE-3778
Project: Lucene - Java
Issue Type: Improvement
Components: modules/grouping
Reporter: Martijn van Groningen
Fix For: 4.0
Attachments: LUCENE-3778.patch, LUCENE-3778.patch

Currently the grouping module has many collector classes with a lot of different options per class. I think it would be a good idea to have a GroupUtil (or another name?) convenience class. I think this could be a builder, because of the many options (sort, sortWithinGroup, groupOffset, groupCount and more) and implementations (term/dv/function) grouping has.
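The builder idea could look roughly like this. GroupUtil and all of its setters are invented names for illustration only, not the grouping module's actual API:

```java
// Hypothetical fluent builder collecting the grouping options named above
// (groupOffset, groupCount, sortWithinGroup) before any search is run.
class GroupUtil {
    private String groupField;
    private int groupOffset = 0;
    private int groupCount = 10;
    private boolean sortWithinGroup = false;

    GroupUtil setGroupField(String f)       { this.groupField = f; return this; }
    GroupUtil setGroupOffset(int n)         { this.groupOffset = n; return this; }
    GroupUtil setGroupCount(int n)          { this.groupCount = n; return this; }
    GroupUtil setSortWithinGroup(boolean b) { this.sortWithinGroup = b; return this; }

    /** Summarize the configured request (stand-in for building collectors). */
    String describe() {
        return groupField + "[" + groupOffset + ".." + (groupOffset + groupCount) + "]"
            + (sortWithinGroup ? " sorted" : "");
    }
}
```

The appeal of the builder is that each term/dv/function implementation can consume one configured object instead of every collector constructor repeating the same long parameter list.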
[jira] [Updated] (SOLR-3207) Add field name validation
[ https://issues.apache.org/jira/browse/SOLR-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luca Cavanna updated SOLR-3207:

Attachment: SOLR-3207.patch

First draft patch. I introduced a new FieldNameValidator class, which is used within the IndexSchema class to validate every field name. The new class also exposes some boolean methods that are used within the ReturnFields class, in order to apply the same rules there to detect a field name. That's needed to make sure we accept field names that we can handle within the fl parameter. Apparently, if you use a placeholder as a field name, IndexSchema receives the default value, which can be empty. That's why I'm allowing empty field names. I'm not even sure I understood correctly how placeholders work; can someone help me out with this? Let me know what you think about my patch!

Add field name validation
Key: SOLR-3207
URL: https://issues.apache.org/jira/browse/SOLR-3207
Project: Solr
Issue Type: Improvement
Affects Versions: 4.0
Reporter: Luca Cavanna
Fix For: 4.0
Attachments: SOLR-3207.patch

Given the SOLR-2444 updated fl syntax and the SOLR-2719 regression, it would be useful to add some kind of validation regarding the field names you can use in Solr. The objective would be adding consistency, allowing only field names that you can then use within fl, sorting, etc. The rules, taken from the actual StrParser behaviour, seem to be the following:
- same as used for Java identifiers (Character#isJavaIdentifierPart), plus the use of trailing '.' and '-'
- for the first character the rule is Character#isJavaIdentifierStart minus '$' (the dash can't be used as the first character (SOLR-3191), for example)
[jira] [Updated] (LUCENE-3778) Create a grouping convenience class
[ https://issues.apache.org/jira/browse/LUCENE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen updated LUCENE-3778:

Attachment: LUCENE-3778.patch

Updated patch.
* Changed package.html
* The methods that set something now have a name that begins with set.
[jira] [Updated] (LUCENE-3872) Index changes are lost if you call prepareCommit() then close()
[ https://issues.apache.org/jira/browse/LUCENE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3872:

Attachment: LUCENE-3872.patch

Patch, I think it's ready. One test was failing to call the commit() matching its prepareCommit()... I fixed it.
[jira] [Commented] (SOLR-3207) Add field name validation
[ https://issues.apache.org/jira/browse/SOLR-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230287#comment-13230287 ] Yonik Seeley commented on SOLR-3207:

bq. same used for java identifiers (Character#isJavaIdentifierPart), plus the use of trailing '.' and '-'

I think we should probably define it as I documented in the schema:

<!-- field names should consist of alphanumeric or underscore characters only and not start with a digit. This is not currently strictly enforced, but other field names will not have first class support from all components and back compatibility is not guaranteed. -->
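The documented rule quoted above (alphanumeric or underscore characters only, not starting with a digit) is easy to check directly. A sketch of that rule, not Solr's actual validator:

```java
// Sketch of the recommended-field-name rule from the example schema comment:
// ASCII letters, digits, or underscore, and the first character not a digit.
class FieldNameCheck {
    static boolean isRecommended(String name) {
        if (name.isEmpty()) return false;
        char first = name.charAt(0);
        if (first >= '0' && first <= '9') return false; // no leading digit
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                      || (c >= '0' && c <= '9') || c == '_';
            if (!ok) return false;
        }
        return true;
    }
}
```

This is deliberately stricter than the Character#isJavaIdentifierPart-based rules the StrParser behaviour implies; names like "my-field" pass the parser's rules but are exactly the ones the schema comment warns may lack first class support.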
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230289#comment-13230289 ] Harley Parks commented on SOLR-2155: Bill: the doc's on queryParser state the name of the function can then be used as the main query, gh_geofilt, perhaps something like: /select?q={!gh_geofilt}... but, good question. geofilt is working for me on multivalued fields. my issue is the query result returns the geohash string, not the geohash lat, long. In building the v 1.0.3 jar file for solr2155, I used jdk 6. as I didn't see any errors, so hopefully, that's fine. so, I'm going to see if solr 3.5 will perhaps resolve my issue. Geospatial search using geohash prefixes Key: SOLR-2155 URL: https://issues.apache.org/jira/browse/SOLR-2155 Project: Solr Issue Type: Improvement Reporter: David Smiley Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch There currently isn't a solution in Solr for doing geospatial filtering on documents that have a variable number of points. This scenario occurs when there is location extraction (i.e. via a gazateer) occurring on free text. None, one, or many geospatial locations might be extracted from any given document and users want to limit their search results to those occurring in a user-specified area. I've implemented this by furthering the GeoHash based work in Lucene/Solr with a geohash prefix based filter. A geohash refers to a lat-lon box on the earth. Each successive character added further subdivides the box into a 4x8 (or 8x4 depending on the even/odd length of the geohash) grid. The first step in this scheme is figuring out which geohash grid squares cover the user's search query. 
I've added various extra methods to GeoHashUtils (and added tests) to assist in this purpose. The next step is an actual Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in TermsEnum.seek() to skip to relevant grid squares in the index. Once a matching geohash grid is found, the points therein are compared against the user's query to see if it matches. I created an abstraction GeoShape extended by subclasses named PointDistance... and CartesianBox to support different queried shapes so that the filter need not care about these details. This work was presented at LuceneRevolution in Boston on October 8th. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
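The subdivision scheme described in the issue can be sketched in a few lines. Below is an illustrative Python re-implementation of the standard geohash encoding algorithm (not the GeoHashUtils code from the patch); it also demonstrates the prefix property the filter relies on: a shorter geohash is always a prefix of every longer geohash for points inside its box.

```python
# Illustrative geohash encoder (standard algorithm; NOT the patch's
# GeoHashUtils code). Bits alternate between refining longitude and
# latitude; every 5 bits become one base-32 character.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat, lon, precision=12):
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    ch, bit, even = 0, 0, True      # even-numbered bits refine longitude
    while len(chars) < precision:
        rng = lon_range if even else lat_range
        val = lon if even else lat
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:
            ch = (ch << 1) | 1      # point is in the upper half
            rng[0] = mid
        else:
            ch = ch << 1            # point is in the lower half
            rng[1] = mid
        even = not even
        bit += 1
        if bit == 5:                # 5 bits -> one base-32 character
            chars.append(BASE32[ch])
            ch, bit = 0, 0
    return "".join(chars)
```

A prefix of a geohash names the enclosing grid box, which is what lets a prefix-based filter seek through the terms dictionary one grid square at a time: encoding the same point at precision 5 yields exactly the first 5 characters of the precision-11 hash.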
[jira] [Commented] (LUCENE-3872) Index changes are lost if you call prepareCommit() then close()
[ https://issues.apache.org/jira/browse/LUCENE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230293#comment-13230293 ] Robert Muir commented on LUCENE-3872: - I don't have a better fix, but at the same time I feel you should be able to call close() at any time (such as when handling exceptions in your app), since we are a real Closeable here. Index changes are lost if you call prepareCommit() then close() --- Key: LUCENE-3872 URL: https://issues.apache.org/jira/browse/LUCENE-3872 Project: Lucene - Java Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.6, 4.0 Attachments: LUCENE-3872.patch, LUCENE-3872.patch You are supposed to call commit() after calling prepareCommit(), but... if you forget and call close() after prepareCommit() without calling commit(), then any changes done after the prepareCommit() are silently lost (including adding/deleting docs, but also any completed merges). Spinoff from the java-user thread "lots of .cfs (compound files) in the index directory" from Tim Bogaert. I think to fix this, IW.close should throw an IllegalStateException if prepareCommit() was called with no matching call to commit(). -- This message is automatically generated by JIRA.
[jira] [Commented] (LUCENE-3872) Index changes are lost if you call prepareCommit() then close()
[ https://issues.apache.org/jira/browse/LUCENE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230307#comment-13230307 ] Michael McCandless commented on LUCENE-3872: Well, we could also easily allow skipping the call to commit... in this case IW.close would detect the missing call to commit, call commit to finish the prepared commit, and then call commit again to save any changes done after the prepareCommit and before close.
[jira] [Updated] (LUCENE-3848) basetokenstreamtestcase should fail if tokenstream starts with posinc=0
[ https://issues.apache.org/jira/browse/LUCENE-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3848: Attachment: LUCENE-3848.patch updated patch: I think it's ready to commit. I didn't integrate Mike's nice MockGraphTokenFilter *yet* but will do this under a separate issue: it's likely to expose a few bugs :) basetokenstreamtestcase should fail if tokenstream starts with posinc=0 --- Key: LUCENE-3848 URL: https://issues.apache.org/jira/browse/LUCENE-3848 Project: Lucene - Java Issue Type: Bug Reporter: Robert Muir Fix For: 4.0 Attachments: LUCENE-3848-MockGraphTokenFilter.patch, LUCENE-3848.patch, LUCENE-3848.patch It is meaningless for a tokenstream to start with posinc=0. It has also caused problems and hairiness in the indexer (LUCENE-1255, LUCENE-1542), and it makes senseless tokenstreams. We should add a check and fix any that do this. Furthermore, the same bug can exist in removing-filters if they have enablePositionIncrements=false. I think this option is useful, but it shouldn't mean 'allow broken tokenstream'; it just means we don't add gaps. If you remove tokens with enablePositionIncrements=false it should not cause the TS to start with positionIncrement=0, and it shouldn't 'restructure' the tokenstream (e.g. moving synonyms on top of a different word). It should just not add any 'holes'. -- This message is automatically generated by JIRA.
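The "no holes, but never posinc=0 at the start" rule above can be illustrated with a toy model. The sketch below uses plain Python tuples rather than Lucene's TokenStream API, and the filter itself is invented for illustration: a stopword-removing filter that either folds dropped increments into the next token (creating "holes") or drops them silently, while still guaranteeing the first surviving token keeps posinc >= 1.

```python
def remove_stopwords(tokens, stopset, enable_position_increments):
    """Drop stopwords from a list of (term, position_increment) pairs.

    enable_position_increments=True: the increment of each dropped token
    is folded into the next surviving token, leaving a positional 'hole'.
    enable_position_increments=False: dropped tokens simply vanish, but
    the stream must still start with an increment >= 1, never 0.
    """
    out, pending = [], 0
    for term, inc in tokens:
        if term in stopset:
            if enable_position_increments:
                pending += inc          # accumulate the gap
            continue
        if enable_position_increments:
            out.append((term, inc + pending))
            pending = 0
        else:
            # No holes added; mid-stream inc=0 (stacked synonyms) is kept,
            # but the FIRST emitted token must not start at posinc=0.
            out.append((term, inc if out else max(inc, 1)))
    return out
```

Note the asymmetry: with increments enabled, "the quick fox" minus "the" yields quick at increment 2 (a hole); with them disabled it yields increment 1, and stacked synonyms later in the stream keep their 0 increments untouched.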
[JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 1984 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/1984/ 1 tests failed. FAILED: org.apache.solr.client.solrj.embedded.LargeVolumeJettyTest.testMultiThreaded Error Message: Uncaught exception by thread: Thread[DocThread-1,5,] Stack Trace: org.apache.lucene.util.UncaughtExceptionsRule$UncaughtExceptionsInBackgroundThread: Uncaught exception by thread: Thread[DocThread-1,5,] at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:60) at org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:618) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164) at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:20) at org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:51) at org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:21) at org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at 
junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:743) Caused by: java.lang.AssertionError: DocThread-1---http://localhost:17096/solr at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.client.solrj.LargeVolumeTestBase$DocThread.run(LargeVolumeTestBase.java:120) Build Log (for compile errors): [...truncated 10124 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-3196) partialResults response header not propagated in distributed search
[ https://issues.apache.org/jira/browse/SOLR-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-3196. -- Resolution: Fixed Fix Version/s: 4.0 3.6 4.x r: 1301097 3.6 r: 1301096 Right, after looking a bit more closely, there's nothing here that would break back-compat; that was just my paranoia at work. partialResults response header not propagated in distributed search --- Key: SOLR-3196 URL: https://issues.apache.org/jira/browse/SOLR-3196 Project: Solr Issue Type: Bug Components: search Affects Versions: 3.5, 4.0 Reporter: Russell Black Labels: patch Fix For: 3.6, 4.0 Attachments: SOLR-3196-3x.patch, SOLR-3196-partialResults-header.patch For {{timeAllowed=true}} requests, the response contains a {{partialResults}} header that indicates when a search was terminated early due to running out of time. This header is being discarded by the collator. Patch to follow. -- This message is automatically generated by JIRA.
[jira] [Created] (LUCENE-3873) tie MockGraphTokenFilter into all analyzers tests
tie MockGraphTokenFilter into all analyzers tests - Key: LUCENE-3873 URL: https://issues.apache.org/jira/browse/LUCENE-3873 Project: Lucene - Java Issue Type: Task Components: modules/analysis Reporter: Robert Muir Mike made a MockGraphTokenFilter on LUCENE-3848. Many filters currently aren't tested with anything but a simple tokenstream. We should test them with this too; it might find bugs (zero-length terms, stacked terms/synonyms, etc.). -- This message is automatically generated by JIRA.
[jira] [Created] (SOLR-3251) dynamically add field to schema
dynamically add field to schema --- Key: SOLR-3251 URL: https://issues.apache.org/jira/browse/SOLR-3251 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley One related piece of functionality needed for SOLR-3250 is the ability to dynamically add a field to the schema. -- This message is automatically generated by JIRA.
[jira] [Commented] (LUCENE-3848) basetokenstreamtestcase should fail if tokenstream starts with posinc=0
[ https://issues.apache.org/jira/browse/LUCENE-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230341#comment-13230341 ] Michael McCandless commented on LUCENE-3848: +1
[jira] [Updated] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated SOLR-3251: --- Attachment: SOLR-3251.patch Here's a quick start... no tests or external API yet.
[jira] [Commented] (LUCENE-3872) Index changes are lost if you call prepareCommit() then close()
[ https://issues.apache.org/jira/browse/LUCENE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230345#comment-13230345 ] Robert Muir commented on LUCENE-3872: - {quote} in this case IW.close would detect the missing call to commit, call commit, and call commit again to save any changes done after the prepareCommit and before close. {quote} I think that would make it even more lenient, complicated, and worse. I guess I feel close() should really be rollback(), but that is likely ridiculous to change. So on second thought I think the patch is good... if someone is handling exceptional cases like this they should be thinking about using rollback() anyway, and they still have that option. I wasn't really against the patch, just whining. It's definitely an improvement on the current behavior; let's do it.
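The prepareCommit/commit/close contract under debate can be modeled as a small state machine. The sketch below is a toy Python analogue, not IndexWriter itself; the class and method names are invented for illustration. It implements the fix the issue proposes: close() refuses to silently drop a prepared-but-unfinished commit.

```python
class ToyWriter:
    """Toy model of IndexWriter's prepareCommit/commit/close contract."""

    def __init__(self):
        self.pending = None      # changes staged by prepare_commit()
        self.buffered = []       # changes not yet prepared
        self.committed = []      # durable changes

    def add(self, doc):
        self.buffered.append(doc)

    def prepare_commit(self):
        # first phase of a two-phase commit: stage buffered changes
        if self.pending is not None:
            raise RuntimeError("prepareCommit already in flight")
        self.pending, self.buffered = list(self.buffered), []

    def commit(self):
        # plain commit() = prepare + finish
        if self.pending is None:
            self.prepare_commit()
        self.committed.extend(self.pending)
        self.pending = None

    def close(self):
        # the proposed fix: throw instead of silently losing the
        # prepared commit (and everything buffered after it)
        if self.pending is not None:
            raise RuntimeError(
                "prepareCommit() was called with no matching commit()")
        self.commit()            # close otherwise commits, as IW does
```

In this model, calling close() right after prepare_commit() raises instead of discarding staged changes, which mirrors the IllegalStateException the patch adds; calling commit() first makes the subsequent close() legal.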
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230346#comment-13230346 ] Harley Parks commented on SOLR-2155: Oh... I may have messed up my build, since I did not include the Solr 3.4 jar files in the classpath. Is there an environment variable that Maven will use, such as CLASSPATH, or a lib folder in the project being built?
[jira] [Created] (LUCENE-3874) bogus positions create a corrumpt index
bogus positions create a corrumpt index --- Key: LUCENE-3874 URL: https://issues.apache.org/jira/browse/LUCENE-3874 Project: Lucene - Java Issue Type: Bug Affects Versions: 4.0 Reporter: Robert Muir It's pretty common that positionIncrement can overflow; this happens really easily if people write analyzers that don't clearAttributes(). It used to be the case that if this happened (and perhaps it still is in 3.x, I didn't check), IW would throw an exception. But I couldn't find the code checking this; I wrote a test and it makes a corrupt index... -- This message is automatically generated by JIRA.
[jira] [Updated] (LUCENE-3874) bogus positions create a corrumpt index
[ https://issues.apache.org/jira/browse/LUCENE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3874: Attachment: LUCENE-3874_test.patch Simple test that overflows posinc. Output is: {noformat} junit-sequential: [junit] Testsuite: org.apache.lucene.index.TestIndexWriterExceptions [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.239 sec [junit] [junit] - Standard Output --- [junit] CheckIndex failed [junit] Segments file=segments_1 numSegments=1 version=4.0 format=FORMAT_4_0 [Lucene 4.0] [junit] 1 of 1: name=_0 docCount=1 [junit] codec=SimpleText [junit] compound=false [junit] hasProx=true [junit] numFiles=4 [junit] size (MB)=0 [junit] diagnostics = {os.version=3.0.0-14-generic, os=Linux, lucene.version=4.0-SNAPSHOT, source=flush, os.arch=amd64, java.version=1.6.0_24, java.vendor=Sun Microsystems Inc.} [junit] has deletions [delGen=-1] [junit] test: open reader.OK [junit] test: fields..OK [1 fields] [junit] test: field norms.OK [1 fields] [junit] test: terms, freq, prox...ERROR: java.lang.RuntimeException: term [66 6f 6f]: doc 0: pos -2 is out of bounds [junit] java.lang.RuntimeException: term [66 6f 6f]: doc 0: pos -2 is out of bounds [junit] at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:860) ... {noformat} bogus positions create a corrumpt index --- Key: LUCENE-3874 URL: https://issues.apache.org/jira/browse/LUCENE-3874 Project: Lucene - Java Issue Type: Bug Affects Versions: 4.0 Reporter: Robert Muir Attachments: LUCENE-3874_test.patch Its pretty common that positionIncrement can overflow, this happens really easily if people write analyzers that don't clearAttributes(). It used to be the case that if this happened (and perhaps still is in 3.x, i didnt check), that IW would throw an exception. But i couldnt find the code checking this, I wrote a test and it makes a corrumpt index... -- This message is automatically generated by JIRA. 
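The "pos -2 is out of bounds" in the CheckIndex output above is Java int overflow: the indexer accumulates absolute positions from per-token increments, and a runaway increment wraps past Integer.MAX_VALUE into negative territory. A small Python illustration of 32-bit signed wraparound (Java's int semantics; Python's own ints never overflow, so the wrap is simulated explicitly):

```python
# Simulate Java's 32-bit signed integer addition to show how a bogus
# positionIncrement turns into a negative absolute position.
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def java_int_add(a, b):
    """Add with Java int wraparound semantics (overflow wraps, no error)."""
    s = (a + b) & 0xFFFFFFFF
    return s - 2**32 if s > INT_MAX else s

def accumulate_positions(increments):
    """Fold position increments into absolute positions, as an indexer does.

    Positions start at -1 so the first token with increment 1 lands at 0.
    """
    pos, positions = -1, []
    for inc in increments:
        pos = java_int_add(pos, inc)
        positions.append(pos)
    return positions
```

With well-behaved increments the positions are 0, 1, 2, ...; feed in two huge increments (as an analyzer that forgets clearAttributes() can produce) and the running position wraps to a negative value just like the one CheckIndex reported.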
[jira] [Commented] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230373#comment-13230373 ] Yonik Seeley commented on SOLR-3251: Any ideas for an external API? We could use a single entry point for all things schema related... http://localhost:8983/solr/schema {addField:{myfield:{type:int ...}}} Or more specific to fields... http://localhost:8983/solr/fields OR PUT/POST to http://localhost:8983/solr/schema/fields (nesting all schema-related stuff under schema would pollute the namespace less) {myfield:{type:int ...}} I'm leaning toward the last option. Thoughts?
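The last option in the comment above amounts to a PUT of a JSON body against /solr/schema/fields. The snippet below only builds such a payload; the endpoint, the field name "myfield", and its attributes are the hypothetical example from the comment (none of this API existed in Solr at the time):

```python
import json

# Field definition shaped like the comment's {myfield:{type:int ...}} example.
# "myfield" and its attributes are hypothetical, taken from the proposal.
payload = {
    "myfield": {
        "type": "int",
        "indexed": True,
        "stored": True,
    }
}

body = json.dumps(payload)
# The request as proposed would then look something like:
#   curl -X PUT http://localhost:8983/solr/schema/fields \
#        -H 'Content-Type: application/json' -d "$body"
print(body)
```

Keeping field definitions under /schema/fields, rather than at the core root, matches the namespace argument in the comment: everything schema-related stays under one path.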
[jira] [Updated] (LUCENE-3874) bogus positions create a corrumpt index
[ https://issues.apache.org/jira/browse/LUCENE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3874: Affects Version/s: 3.6 3.x too: just s/TextField/Field to port the test
[jira] [Commented] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230381#comment-13230381 ] Ryan McKinley commented on SOLR-3251: - Does this imply that the schema would be writeable? The PUT/POST option is nicer.
[jira] [Commented] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230390#comment-13230390 ] Ryan McKinley commented on SOLR-3251: - What are the thoughts on error handling? Are you only able to add fields that don't exist? What if they exist in the schema but not in the index? What about if the index analyzer is identical but the query analyzer has changed?
[jira] [Commented] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230393#comment-13230393 ] Yonik Seeley commented on SOLR-3251: bq. Does this imply that the schema would be writeable? The in-memory schema object, yes. The question is how to persist changes. I was thinking it might be easiest to keep a separate file alongside schema.xml for dynamically added fields for now. The term dynamicFields has already been taken, though, and we probably shouldn't overload it. Maybe extra_fields.json? Or maybe even schema.json/schema.yaml that acts as an extension of schema.xml (and could acquire additional features over time, such as the ability to define types too)? But a separate file that just lists fields will be much quicker (and easier) to update. Reloading a full schema.xml (along with type instantiation) would currently be somewhat prohibitive.
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230398#comment-13230398 ] Harley Parks commented on SOLR-2155: All of the Class Paths in the solr1.0.3 project point to apache solr 3.4 libraries on the apache website... so no action needed, to answer my own question. I'm stumped. Geospatial search using geohash prefixes Key: SOLR-2155 URL: https://issues.apache.org/jira/browse/SOLR-2155 Project: Solr Issue Type: Improvement Reporter: David Smiley Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch There currently isn't a solution in Solr for doing geospatial filtering on documents that have a variable number of points. This scenario occurs when there is location extraction (i.e. via a gazateer) occurring on free text. None, one, or many geospatial locations might be extracted from any given document and users want to limit their search results to those occurring in a user-specified area. I've implemented this by furthering the GeoHash based work in Lucene/Solr with a geohash prefix based filter. A geohash refers to a lat-lon box on the earth. Each successive character added further subdivides the box into a 4x8 (or 8x4 depending on the even/odd length of the geohash) grid. The first step in this scheme is figuring out which geohash grid squares cover the user's search query. I've added various extra methods to GeoHashUtils (and added tests) to assist in this purpose. The next step is an actual Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in TermsEnum.seek() to skip to relevant grid squares in the index. Once a matching geohash grid is found, the points therein are compared against the user's query to see if it matches. 
I created an abstraction GeoShape extended by subclasses named PointDistance... and CartesianBox to support different queried shapes so that the filter need not care about these details. This work was presented at LuceneRevolution in Boston on October 8th. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
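The subdivision scheme David describes is easy to see in plain Java. The sketch below is an illustrative re-implementation of standard geohash encoding, not code from the patch; the class and method names are made up for the example. The key property the filter exploits is that nearby points share a common geohash prefix, so a prefix seek in the terms dictionary jumps straight to one grid square's points.

```java
// Illustrative geohash encoder: each character narrows a lat/lon box,
// so a shared prefix means two points fall in the same grid square.
public class GeohashSketch {
    static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    static String encode(double lat, double lon, int precision) {
        double[] latRange = {-90.0, 90.0};
        double[] lonRange = {-180.0, 180.0};
        boolean evenBit = true; // bits alternate: longitude first, then latitude
        int bit = 0, ch = 0;
        StringBuilder sb = new StringBuilder();
        while (sb.length() < precision) {
            double[] range = evenBit ? lonRange : latRange;
            double val = evenBit ? lon : lat;
            double mid = (range[0] + range[1]) / 2;
            ch <<= 1;
            if (val >= mid) { ch |= 1; range[0] = mid; } else { range[1] = mid; }
            evenBit = !evenBit;
            // every 5 bits become one base-32 character, i.e. a 4x8 or 8x4 subdivision
            if (++bit == 5) { sb.append(BASE32.charAt(ch)); bit = 0; ch = 0; }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two nearby points share a long prefix; a prefix filter can seek
        // directly to that prefix in the index instead of scanning all terms.
        String a = encode(42.6051, -5.6031, 9);
        String b = encode(42.6052, -5.6032, 9);
        System.out.println(a + " / " + b);
    }
}
```

The classic reference point (42.605, -5.603) encodes to "ezs42" at precision 5, and any point inside that roughly 5 km cell carries "ezs42" as a prefix of its longer hash.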
[jira] [Updated] (LUCENE-3874) bogus positions create a corrupt index
[ https://issues.apache.org/jira/browse/LUCENE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3874: Attachment: LUCENE-3874.patch First cut at a patch; throws IllegalArgumentException and aborts the doc (ensuring fieldState never sees the overflow, since I don't trust what happens to it after this!) bogus positions create a corrupt index --- Key: LUCENE-3874 URL: https://issues.apache.org/jira/browse/LUCENE-3874 Project: Lucene - Java Issue Type: Bug Affects Versions: 3.6, 4.0 Reporter: Robert Muir Attachments: LUCENE-3874.patch, LUCENE-3874_test.patch It's pretty common for positionIncrement to overflow; this happens really easily if people write analyzers that don't clearAttributes(). It used to be the case that if this happened (and perhaps it still is in 3.x, I didn't check), IW would throw an exception. But I couldn't find the code checking this; I wrote a test and it produces a corrupt index...
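The overflow Robert describes is plain int arithmetic: the indexer accumulates position += posIncr per token, so a runaway increment wraps negative and a negative position ends up in the postings. A hedged sketch of the kind of guard the patch adds; the method and class names here are illustrative, not the actual IndexWriter code:

```java
// Sketch of guarding accumulated token positions against int overflow.
public class PositionGuard {
    static int nextPosition(int position, int posIncr) {
        if (posIncr < 0) {
            throw new IllegalArgumentException("position increment must be >= 0, got " + posIncr);
        }
        long next = (long) position + posIncr; // widen to long before adding
        if (next > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("position overflows int: " + next);
        }
        return (int) next;
    }

    public static void main(String[] args) {
        int pos = nextPosition(0, 1);  // normal first token at position 1
        pos = nextPosition(pos, 1);    // next token at position 2
        // An analyzer that forgets clearAttributes() can accumulate a huge
        // increment; without the check this would wrap to a negative position.
        try {
            nextPosition(Integer.MAX_VALUE - 1, 5);
        } catch (IllegalArgumentException expected) {
            System.out.println("caught overflow: " + expected.getMessage());
        }
    }
}
```

Failing loudly and aborting the doc, as the patch does, keeps the bad position out of fieldState entirely instead of silently writing a corrupt index.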
[jira] [Commented] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230399#comment-13230399 ] Sami Siren commented on SOLR-3251: -- I like the latter option more. dynamically add field to schema --- Key: SOLR-3251 URL: https://issues.apache.org/jira/browse/SOLR-3251 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley Attachments: SOLR-3251.patch One related piece of functionality needed for SOLR-3250 is the ability to dynamically add a field to the schema.
[jira] [Commented] (LUCENE-3874) bogus positions create a corrupt index
[ https://issues.apache.org/jira/browse/LUCENE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230402#comment-13230402 ] Michael McCandless commented on LUCENE-3874: +1 Crazy we don't catch this already...
[jira] [Issue Comment Edited] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230399#comment-13230399 ] Sami Siren edited comment on SOLR-3251 at 3/15/12 6:32 PM: --- bq. Any ideas for an external API? I like the latter option more. was (Author: siren): I like the latter option more.
[jira] [Commented] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230406#comment-13230406 ] Ryan McKinley commented on SOLR-3251: - bq. separate file alongside schema.xml This makes sense. As is, the ad-hoc naming conventions in schema make writing out the full schema pretty daunting.
[jira] [Commented] (SOLR-3207) Add field name validation
[ https://issues.apache.org/jira/browse/SOLR-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230408#comment-13230408 ] Luca Cavanna commented on SOLR-3207: The first letter should be ok as checked in my patch. Regarding the trailing characters, do you mean we shouldn't use isJavaIdentifierPart anymore but something else? That's even more restrictive than my patch, since I've used the existing rules applied while parsing the fl parameter (ReturnFields class). No problem for me, but are we all sure we want to proceed this way? I'll update my patch later on. Then I'd document this within the Schema wiki page. That's a big change; any opinion is welcome! Add field name validation - Key: SOLR-3207 URL: https://issues.apache.org/jira/browse/SOLR-3207 Project: Solr Issue Type: Improvement Affects Versions: 4.0 Reporter: Luca Cavanna Fix For: 4.0 Attachments: SOLR-3207.patch Given the SOLR-2444 updated fl syntax and the SOLR-2719 regression, it would be useful to add some kind of validation regarding the field names you can use on Solr. The objective would be adding consistency, allowing only field names that you can then use within fl, sorting, etc. The rules, taken from the actual StrParser behaviour, seem to be the following: - same as used for Java identifiers (Character#isJavaIdentifierPart), plus the use of trailing '.' and '-' - for the first character the rule is Character#isJavaIdentifierStart minus '$' (the dash can't be used as the first character (SOLR-3191), for example)
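The rules Luca lists map directly onto the JDK's identifier predicates. A minimal sketch of a validator following those rules; the class and method names are hypothetical, not the attached patch:

```java
// Sketch of the field-name rules from the issue description:
// first char: Character.isJavaIdentifierStart minus '$';
// rest: Character.isJavaIdentifierPart plus '.' and '-'.
public class FieldNameCheck {
    static boolean isLegalFieldName(String name) {
        if (name == null || name.isEmpty()) return false;
        char first = name.charAt(0);
        if (first == '$' || !Character.isJavaIdentifierStart(first)) return false;
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c != '.' && c != '-' && !Character.isJavaIdentifierPart(c)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isLegalFieldName("price"));   // legal
        System.out.println(isLegalFieldName("geo.lat")); // '.' allowed after the first char
        System.out.println(isLegalFieldName("-price"));  // dash illegal as first char (SOLR-3191)
        System.out.println(isLegalFieldName("$deref"));  // '$' reserved for variable dereferencing
    }
}
```

Note that Character.isJavaIdentifierPart accepts a wide and Unicode-version-dependent set of characters, which is exactly the looseness Yonik flags in the follow-up comment.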
[jira] [Commented] (SOLR-3207) Add field name validation
[ https://issues.apache.org/jira/browse/SOLR-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230422#comment-13230422 ] Yonik Seeley commented on SOLR-3207: bq. Regarding the trailing characters, do you mean we shouldn't use isJavaIdentifierPart anymore but something else? That was just a shortcut... looking again, it's pretty open (maybe more open than we want?), especially since Unicode changes over time. Anyway, isJavaIdentifierPart doesn't include - or . If people do need another separator-type character, we could allow $ too (just not as the first char, since that's taken by variable dereferencing). bq. That's even more restrictive than my patch since I've used the existing rules applied while parsing the fl parameter (ReturnFields class). Allowing '-' in the fl was just to resolve that regression for people who already used field names like that and are upgrading. If we want to start validating field names strictly, then we should bump the schema version number (and skip validation when the version number is less than that).
[jira] [Commented] (LUCENE-3872) Index changes are lost if you call prepareCommit() then close()
[ https://issues.apache.org/jira/browse/LUCENE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230436#comment-13230436 ] Dawid Weiss commented on LUCENE-3872: - bq. I guess I feel close() should really be rollback(). Yeah... I think this feeling of unease is fairly common -- see JDBC's Connection javadoc on close, for example: It is strongly recommended that an application explicitly commits or rolls back an active transaction prior to calling the close method. If the close method is called and there is an active transaction, the results are implementation-defined. Index changes are lost if you call prepareCommit() then close() --- Key: LUCENE-3872 URL: https://issues.apache.org/jira/browse/LUCENE-3872 Project: Lucene - Java Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 3.6, 4.0 Attachments: LUCENE-3872.patch, LUCENE-3872.patch You are supposed to call commit() after calling prepareCommit(), but... if you forget, and call close() after prepareCommit() without calling commit(), then any changes done after the prepareCommit() are silently lost (including adding/deleting docs, but also any completed merges). Spinoff from the java-user thread "lots of .cfs (compound files) in the index directory" from Tim Bogaert. I think to fix this, IW.close should throw an IllegalStateException if prepareCommit() was called with no matching call to commit().
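Mike's proposed fix is a state check at close time. Stripped of all the Lucene machinery, the contract looks like this toy two-phase writer; it illustrates the proposed IllegalStateException, not IndexWriter's actual internals:

```java
// Toy model of the prepareCommit()/commit()/close() contract.
public class TwoPhaseWriter {
    private boolean pendingCommit = false;

    void prepareCommit() { pendingCommit = true; }  // flush + fsync; commit point not yet visible
    void commit()        { pendingCommit = false; } // make the prepared commit point live
    void rollback()      { pendingCommit = false; } // discard the prepared commit point

    void close() {
        // Without this check, changes after prepareCommit() are silently lost.
        if (pendingCommit) {
            throw new IllegalStateException(
                "prepareCommit() was called with no matching call to commit()");
        }
    }

    public static void main(String[] args) {
        TwoPhaseWriter w = new TwoPhaseWriter();
        w.prepareCommit();
        w.commit();
        w.close(); // fine: commit() matched the prepare

        TwoPhaseWriter bad = new TwoPhaseWriter();
        bad.prepareCommit();
        try {
            bad.close(); // forgot commit(): fail loudly instead of losing changes
        } catch (IllegalStateException expected) {
            System.out.println("caught: " + expected.getMessage());
        }
    }
}
```

This mirrors the JDBC guidance Dawid quotes: resolve the active transaction explicitly, and make close() refuse to paper over an unresolved one.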
[jira] [Resolved] (LUCENE-3872) Index changes are lost if you call prepareCommit() then close()
[ https://issues.apache.org/jira/browse/LUCENE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-3872. Resolution: Fixed Thanks Tim!
[jira] [Resolved] (LUCENE-3874) bogus positions create a corrupt index
[ https://issues.apache.org/jira/browse/LUCENE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-3874. - Resolution: Fixed Fix Version/s: 4.0 3.6
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230503#comment-13230503 ] Dawid Weiss commented on LUCENE-3867: - I just peeked at the OpenJDK sources, and addressSize() is defined as this: {code} // See comment at file start about UNSAFE_LEAF //UNSAFE_LEAF(jint, Unsafe_AddressSize()) UNSAFE_ENTRY(jint, Unsafe_AddressSize(JNIEnv *env, jobject unsafe)) UnsafeWrapper(Unsafe_AddressSize); return sizeof(void*); UNSAFE_END {code} In this light, this switch becomes interesting: {code} switch (addressSize) { case 4: is64Bit = Boolean.FALSE; break; case 8: is64Bit = Boolean.TRUE; break; } {code} Do you know of any architecture with pointers different than 4 or 8 bytes? :) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect - Key: LUCENE-3867 URL: https://issues.apache.org/jira/browse/LUCENE-3867 Project: Lucene - Java Issue Type: Bug Components: core/index Reporter: Shai Erera Assignee: Shai Erera Priority: Trivial Fix For: 3.6, 4.0 Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like that: NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml {quote} A single-dimension array is a single object. As expected, the array has the usual object header. However, this object header is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ... 
{quote} While on it, I wrote a sizeOf(String) impl, and I wonder how people feel about including such helper methods in RUE as static, stateless methods? It's not perfect; there's some room for improvement, I'm sure. Here it is: {code} /** * Computes the approximate size of a String object. Note that if this object * is also referenced by another object, you should add * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this * method. */ public static int sizeOf(String str) { return 2 * str.length() + 6 // chars + additional safeness for arrays alignment + 3 * RamUsageEstimator.NUM_BYTES_INT // String maintains 3 integers + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER // char[] array + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object } {code} If people are not against it, I'd like to also add sizeOf(int[] / byte[] / long[] / double[] ... and String[]).
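To make Shai's formula concrete, the sketch below passes the JVM-dependent constants in explicitly, since the whole point of this thread is that they vary by JVM and pointer size. The sample values used in main (object header 8, array header 12, int 4) are assumptions typical of a 32-bit or compressed-oops HotSpot, not universal truths:

```java
// Shai's sizeOf(String) formula, with the layout constants parameterized
// because they differ across JVMs (see the addressSize() discussion above).
public class StringSizeSketch {
    static int sizeOf(String str, int objHeader, int arrayHeader, int intBytes) {
        return 2 * str.length() + 6  // chars + padding slack for array alignment
             + 3 * intBytes          // String's three int fields
             + arrayHeader           // the backing char[] header
             + objHeader;            // the String object header
    }

    public static void main(String[] args) {
        // "hello": 2*5 + 6 + 3*4 + 12 + 8 = 48 bytes under the assumed constants
        System.out.println(sizeOf("hello", 8, 12, 4));
    }
}
```

Under a 64-bit JVM without compressed oops, the header constants grow and the estimate shifts accordingly, which is why Dawid's reference-size probes across JamVM, CACAO, and HotSpot matter.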
[jira] [Commented] (LUCENE-3848) basetokenstreamtestcase should fail if tokenstream starts with posinc=0
[ https://issues.apache.org/jira/browse/LUCENE-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230510#comment-13230510 ] Robert Muir commented on LUCENE-3848: - I think this is ready to go in; I'll wait a bit. I didn't make any changes re: graph restructuring, though I still feel we should fix this, but it means dealing with backwards compatibility, etc. The changes in this patch are backwards compatible, in the sense that consumers are already correcting an initial posInc=0 to posInc=1 anyway. basetokenstreamtestcase should fail if tokenstream starts with posinc=0 --- Key: LUCENE-3848 URL: https://issues.apache.org/jira/browse/LUCENE-3848 Project: Lucene - Java Issue Type: Bug Reporter: Robert Muir Fix For: 4.0 Attachments: LUCENE-3848-MockGraphTokenFilter.patch, LUCENE-3848.patch, LUCENE-3848.patch It's meaningless for a tokenstream to start with posinc=0. It's also caused problems and hairiness in the indexer (LUCENE-1255, LUCENE-1542), and it makes for senseless tokenstreams. We should add a check and fix any that do this. Furthermore, the same bug can exist in removing-filters if they have enablePositionIncrements=false. I think this option is useful, but it shouldn't mean 'allow broken tokenstream'; it just means we don't add gaps. If you remove tokens with enablePositionIncrements=false, it should not cause the TS to start with positionIncrement=0, and it shouldn't 'restructure' the tokenstream (e.g. moving synonyms on top of a different word). It should just not add any 'holes'.
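The invariant Robert wants BaseTokenStreamTestCase to enforce is small: the first emitted token must have posinc >= 1, because a leading posinc=0 claims a token stacked on a nonexistent predecessor. A sketch of that check, with plain int arrays standing in for the TokenStream attribute API:

```java
// Sketch: validate position increments the way a strict test harness would.
public class PosIncCheck {
    static void check(int[] posIncs) {
        for (int i = 0; i < posIncs.length; i++) {
            if (posIncs[i] < 0) {
                throw new IllegalStateException("negative posinc " + posIncs[i] + " at token " + i);
            }
            if (i == 0 && posIncs[0] == 0) {
                // Consumers today silently "correct" a leading 0 to 1;
                // a test harness should fail instead of papering over it.
                throw new IllegalStateException("first token must have posinc >= 1");
            }
        }
    }

    public static void main(String[] args) {
        check(new int[] {1, 1, 0, 1}); // ok: posinc=0 mid-stream is a stacked synonym
        try {
            check(new int[] {0, 1});   // broken: stream starts at posinc=0
        } catch (IllegalStateException expected) {
            System.out.println("caught: " + expected.getMessage());
        }
    }
}
```

Mid-stream posinc=0 stays legal (stacked synonyms); only the leading zero is senseless.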
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230529#comment-13230529 ] Dawid Weiss commented on LUCENE-3867: - A few more exotic JITs from OpenJDK (all seem to be using an explicit 8-byte ref size on 64-bit): {noformat} ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-jamvm [junit] JVM: OpenJDK Runtime Environment, JamVM, Robert Lougher, 1.6.0-devel, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_23, Sun Microsystems Inc., null, [junit] NOTE: This JVM is 64bit: true [junit] NOTE: Reference size in this JVM: 8 ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-jamvm -XX:+UseCompressedOops [junit] JVM: OpenJDK Runtime Environment, JamVM, Robert Lougher, 1.6.0-devel, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_23, Sun Microsystems Inc., null, [junit] NOTE: This JVM is 64bit: true [junit] NOTE: Reference size in this JVM: 8 ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-cacao [junit] JVM: OpenJDK Runtime Environment, CACAO, CACAOVM - Verein zur Foerderung der freien virtuellen Maschine CACAO, 1.1.0pre2, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_23, Sun Microsystems Inc., null, [junit] NOTE: This JVM is 64bit: true [junit] NOTE: Reference size in this JVM: 8 ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-server [junit] JVM: OpenJDK Runtime Environment, OpenJDK 64-Bit Server VM, Sun Microsystems Inc., 20.0-b11, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_23, Sun Microsystems Inc., null, [junit] NOTE: This JVM is 64bit: true [junit] NOTE: Reference size in this JVM: 4 ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-server -XX:-UseCompressedOops [junit] JVM: OpenJDK Runtime Environment, OpenJDK 64-Bit Server VM, Sun Microsystems Inc., 20.0-b11, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_23, Sun Microsystems Inc., null, [junit] NOTE: This JVM is 64bit: true [junit] NOTE: Reference size in this JVM: 8 {noformat}
[jira] [Created] (LUCENE-3875) ValueSourceFilter
ValueSourceFilter - Key: LUCENE-3875 URL: https://issues.apache.org/jira/browse/LUCENE-3875 Project: Lucene - Java Issue Type: New Feature Reporter: Andrew Morrison Attachments: LUCENE-3875.patch A ValueSourceFilter is a filter that takes a ValueSource and a threshold value, filtering out documents whose value, as returned by the ValueSource, is below the threshold. We use the ValueSourceFilter for filtering documents based on their value in an ExternalFileField.
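The selection rule is easy to show outside the Filter API: given a per-document value (what the ValueSource supplies, e.g. loaded from an ExternalFileField), keep only doc ids whose value is at or above the threshold. An illustrative sketch with made-up names, not the attached patch:

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Sketch of ValueSourceFilter's rule: documents whose value from the
// ValueSource falls below the threshold are filtered out.
public class ThresholdFilterSketch {
    static int[] acceptedDocs(double[] valuePerDoc, double threshold) {
        return IntStream.range(0, valuePerDoc.length)
                        .filter(doc -> valuePerDoc[doc] >= threshold)
                        .toArray();
    }

    public static void main(String[] args) {
        // e.g. per-doc boosts loaded from an external file
        double[] boosts = {0.2, 0.9, 0.5, 0.75};
        System.out.println(Arrays.toString(acceptedDocs(boosts, 0.5))); // [1, 2, 3]
    }
}
```

In the real filter the values come lazily from the ValueSource per segment rather than from a materialized array, but the accept/reject decision is the same comparison.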
[jira] [Updated] (LUCENE-3875) ValueSourceFilter
[ https://issues.apache.org/jira/browse/LUCENE-3875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Morrison updated LUCENE-3875: Attachment: LUCENE-3875.patch
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230535#comment-13230535 ] Dawid Weiss commented on LUCENE-3867: - Mac: {noformat} ant test-core -Dtestcase=TestRam* -Dtests.verbose=true [junit] JVM: Java(TM) SE Runtime Environment, Java HotSpot(TM) 64-Bit Server VM, Apple Inc., 20.4-b02-402, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_29, Apple Inc., null, [junit] NOTE: This JVM is 64bit: true [junit] NOTE: Reference size in this JVM: 4 ant test-core -Dtestcase=TestRam* -Dtests.verbose=true -Dargs=-server -XX:-UseCompressedOops [junit] JVM: Java(TM) SE Runtime Environment, Java HotSpot(TM) 64-Bit Server VM, Apple Inc., 20.4-b02-402, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_29, Apple Inc., null, [junit] NOTE: This JVM is 64bit: true [junit] NOTE: Reference size in this JVM: 8 {noformat}
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230543#comment-13230543 ] Mark Miller commented on LUCENE-3867: - Nooo!!! My eyes! I'm pretty sure my liver has just been virally licensed! RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect - Key: LUCENE-3867 URL: https://issues.apache.org/jira/browse/LUCENE-3867 Project: Lucene - Java Issue Type: Bug Components: core/index Reporter: Shai Erera Assignee: Shai Erera Priority: Trivial Fix For: 3.6, 4.0 Attachments: LUCENE-3867-compressedOops.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch, LUCENE-3867.patch RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed as NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml {quote} A single-dimension array is a single object. As expected, the array has the usual object header. However, this object header is 12 bytes to accommodate a four-byte array length. Then comes the actual array data which, as you might expect, consists of the number of elements multiplied by the number of bytes required for one element, depending on its type. The memory usage for one element is 4 bytes for an object reference ... {quote} While I was at it, I wrote a sizeOf(String) impl, and I wonder how people feel about including such helper methods in RUE as static, stateless methods. It's not perfect; there's some room for improvement, I'm sure. Here it is:
{code}
/**
 * Computes the approximate size of a String object. Note that if this object
 * is also referenced by another object, you should add
 * {@link RamUsageEstimator#NUM_BYTES_OBJECT_REF} to the result of this method.
 */
public static int sizeOf(String str) {
  return 2 * str.length() + 6                      // chars + additional safety for array alignment
      + 3 * RamUsageEstimator.NUM_BYTES_INT        // String maintains 3 integers
      + RamUsageEstimator.NUM_BYTES_ARRAY_HEADER   // char[] array
      + RamUsageEstimator.NUM_BYTES_OBJECT_HEADER; // String object
}
{code}
If people are not against it, I'd like to also add sizeOf(int[] / byte[] / long[] / double[] ... and String[]).
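Spelled out, the arithmetic the cited page (and the fix) describes is: an array's header is the object header plus a four-byte length, with no extra object reference. A minimal sketch of that accounting, using the 32-bit figures from the page (the class and method names are mine, and real values are VM-dependent):

```java
// Hedged sketch of the array-size arithmetic described in the issue; the
// constants are the 32-bit figures from the cited page, not authoritative.
public class ArraySize {
    static final int OBJECT_HEADER = 8; // mark word + class pointer
    static final int ARRAY_LENGTH  = 4; // the int length field; header + length = the 12-byte array header
    static final int OBJECT_REF    = 4; // one element of an Object[]

    // Total bytes for an Object[] of the given length, rounded up to 8-byte alignment.
    static long objectArrayBytes(int length) {
        long raw = OBJECT_HEADER + ARRAY_LENGTH + (long) OBJECT_REF * length;
        return (raw + 7) & ~7L; // round up to the next 8-byte boundary
    }
}
```

Note that NUM_BYTES_OBJECT_REF appears only in the per-element term, never in the header, which is exactly the bug being fixed.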
[jira] [Commented] (LUCENE-3869) possible hang in UIMATypeAwareAnalyzerTest
[ https://issues.apache.org/jira/browse/LUCENE-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230549#comment-13230549 ] Robert Muir commented on LUCENE-3869: - I think I got lucky twice yesterday... it's pretty hard to reproduce this now. Maybe a thread-safety issue? I'll look more. My computer has been known to be crazy... don't waste time on this one, Tommaso; I'll try to dig in more. possible hang in UIMATypeAwareAnalyzerTest -- Key: LUCENE-3869 URL: https://issues.apache.org/jira/browse/LUCENE-3869 Project: Lucene - Java Issue Type: Bug Components: modules/analysis Affects Versions: 4.0 Reporter: Robert Muir Just testing an unrelated patch, I was hung (with 100% CPU) in UIMATypeAwareAnalyzerTest. I'll attach a stacktrace from the moment of the hang. The fact that we get a seed in the actual stacktraces for cases like this is awesome! Thanks, Dawid! I don't think it reproduces 100%, but I'll try beasting this seed to see if I can reproduce the hang: it should be 'ant test -Dtestcase=UIMATypeAwareAnalyzerTest -Dtests.seed=-262aada3325aa87a:-44863926cf5c87e9:5c8c471d901b98bd' from what I can see.
[jira] [Resolved] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
[ https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McKinley resolved LUCENE-3795. --- Resolution: Fixed I will mark this resolved, and we can open new issues for ongoing problems. The next big step is to integrate with Solr. Replace spatial contrib module with LSP's spatial-lucene module --- Key: LUCENE-3795 URL: https://issues.apache.org/jira/browse/LUCENE-3795 Project: Lucene - Java Issue Type: New Feature Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 4.0 I propose that Lucene's spatial contrib module be replaced with the spatial-lucene module within Lucene Spatial Playground (LSP). LSP has been in development for approximately 1 year by David Smiley, Ryan McKinley, and Chris Male, and we feel it is ready. LSP is here: http://code.google.com/p/lucene-spatial-playground/ and the spatial-lucene module is intuitively in svn/trunk/spatial-lucene/. I'll add more comments to prevent the issue description from being too long.
[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
[ https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230565#comment-13230565 ] Ryan McKinley commented on LUCENE-3795: --- did not mean to 'resolve' the Math.toRadians issue though -- I think we should change that back to multiplication... Math.* seems to be pretty clunky
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230572#comment-13230572 ] Dawid Weiss commented on LUCENE-3867: - Ok, right, sorry, let me scramble for intellectual property protection reasons: {noformat} // See cemnmot at flie sratt abuot U_ANEESAFLF / / ULAAFEN_SEF (jnit, UfdAsnerS_zsiaedse ()) UEATERSNFN_Y (jint, UnidsdserSAasfe_ze (JNnEIv * env, jcbjeot unfsae)) UesWrpfapaner ( UdenfsSseAazs_drie ); rreutn seiozf (void * ;) UNEF_SNEAD {noformat}
[jira] [Commented] (LUCENE-3795) Replace spatial contrib module with LSP's spatial-lucene module
[ https://issues.apache.org/jira/browse/LUCENE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230574#comment-13230574 ] Yonik Seeley commented on LUCENE-3795: -- bq. I'd be very surprised to hear if this is true. If Math.toRadians had been written as x*(PI/180.0), then the compiler would have done constant folding and it would simply be multiplication by a constant. But it's unfortunately written as x/180.0*PI (for no good reason in this case), and the compiler/JVM is not allowed to do that simple transformation by itself. That's why we do it. Sometimes knowing how optimizers work, and the restrictions on them, allows one to know what will be faster or slower without benchmarking. I did benchmark it after the fact (after you questioned it), and it was indeed the case that Math.toRadians was much slower than a simple multiply.
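Yonik's constant-folding point can be shown concretely. A minimal sketch (the method names are mine, not the spatial module's actual code): the JIT may not rewrite the divide-then-multiply form into a single multiply because the two expressions can differ in the last bit, so the rewrite has to be done by hand.

```java
// Sketch of the two degree-to-radian forms discussed in the comment.
public class DegToRad {
    // PI/180 is folded to a single constant at compile time.
    static final double DEG_TO_RAD = Math.PI / 180.0;

    // Mirrors the expression Yonik describes in Math.toRadians: divide, then multiply.
    static double divideThenMultiply(double deg) {
        return deg / 180.0 * Math.PI;
    }

    // The hand-rewritten form: one multiply by the precomputed constant.
    static double multiplyByConstant(double deg) {
        return deg * DEG_TO_RAD;
    }
}
```

The two methods agree to within rounding error for ordinary inputs, but only the second compiles down to a single floating-point multiply.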
[jira] [Closed] (SOLR-3222) Pull optimal cache warming queries from a warm solr instance
[ https://issues.apache.org/jira/browse/SOLR-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russell Black closed SOLR-3222. --- Resolution: Incomplete Turns out this patch doesn't work, since there is no reliable way to turn a Query object into URL query parameters. I ended up solving the problem with a cache plugin. Let me know if you're interested in the solution and I can post the code. Pull optimal cache warming queries from a warm solr instance Key: SOLR-3222 URL: https://issues.apache.org/jira/browse/SOLR-3222 Project: Solr Issue Type: New Feature Components: search Affects Versions: 3.5, 4.0 Reporter: Russell Black Labels: patch, performance Attachments: SOLR-3222-autowarm.patch Ever wondered what queries to use to prime your cache? This patch allows you to query a warm running instance for a list of warming queries. The list is generated from the server's caches, meaning you get back an optimal set of queries. The set is optimal to the extent that the caches are optimized. The queries are returned in a format that can be consumed by the
{code:xml}
<listener event="firstSearcher" class="solr.QuerySenderListener">
{code}
section of {{solrconfig.xml}}. One can use this feature to generate a static set of good warming queries to place in {{solrconfig.xml}} under that same listener element. It can even be used in a dynamic fashion like this:
{code:xml}
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <xi:include href="http://host/solr/core/autowarm" xpointer="element(/1/2)"
      xmlns:xi="http://www.w3.org/2001/XInclude"/>
</listener>
{code}
which can work well in certain distributed load-balanced architectures, although in production it would be wise to add an {{xi:fallback}} element to the include in case the host is down. I implemented this by introducing a new request handler:
{code:xml}
<requestHandler name="/autowarm" class="solr.AutoWarmRequestHandler" />
{code}
The request handler pulls a configurable number of top keys from the {{filterCache}}, {{fieldValueCache}}, and {{queryResultCache}}. For each key, it constructs a query that will cause that key to be placed in the associated cache. The list of constructed queries is then returned in the response. Patch to follow.
[jira] [Commented] (SOLR-3251) dynamically add field to schema
[ https://issues.apache.org/jira/browse/SOLR-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230627#comment-13230627 ] Hoss Man commented on SOLR-3251: bq. Any ideas for an external API? I think the best way to support this externally is using the existing mechanism for plugins: * a RequestHandler people can register (if they want to support external clients programmatically modifying the schema) that accepts ContentStreams containing whatever payload structure makes sense given the functionality. * an UpdateProcessor people can register (if they want to support stuff like SOLR-3250, where clients adding documents can submit any field name and a type is added based on the type of the value) which could be configured with mappings of Java types to fieldTypes and rules about other field attributes -- i.e. if a client submits a new field=value pair with a java.lang.Integer value, create a new tint field with that name and set stored=true. dynamically add field to schema --- Key: SOLR-3251 URL: https://issues.apache.org/jira/browse/SOLR-3251 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley Attachments: SOLR-3251.patch One related piece of functionality needed for SOLR-3250 is the ability to dynamically add a field to the schema.
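The Java-type-to-fieldType mapping Hoss Man sketches for such an UpdateProcessor could look like the following. This is a hypothetical illustration only: the class name is mine, the field-type names ("tint" etc.) follow the stock example schema, and nothing here is actual Solr API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: choose a Solr field type for a previously unseen
// field based on the Java type of its first value, as the comment describes.
public class FieldTypeMapping {
    private static final Map<Class<?>, String> BY_JAVA_TYPE = new HashMap<>();
    static {
        BY_JAVA_TYPE.put(Integer.class, "tint");
        BY_JAVA_TYPE.put(Long.class, "tlong");
        BY_JAVA_TYPE.put(Double.class, "tdouble");
        BY_JAVA_TYPE.put(java.util.Date.class, "tdate");
    }

    // Fall back to a plain "string" field when no mapping is configured.
    static String fieldTypeFor(Object value) {
        return BY_JAVA_TYPE.getOrDefault(value.getClass(), "string");
    }
}
```

In a real processor this table would be populated from the plugin's configuration rather than hard-coded, along with rules for the other field attributes (stored, indexed, multiValued).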
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230631#comment-13230631 ] Uwe Schindler commented on LUCENE-3867: --- bq. Becomes interesting. Do you know of any architecture with pointers different than 4 or 8 bytes? When I was writing that code, I thought for a very long time about: hm, should I add a default case saying:
{noformat}
default: throw new Error("Lucene does not like architectures with pointer size " + addressSize);
{noformat}
But then I decided: if there is an architecture with a pointer size of 6, does this really break Lucene? Hm, maybe I should have added a comment there:
{noformat}
default: // this is the philosophical case of Lucene reaching an architecture returning something different here
{noformat}
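The default-case question reads concretely as a switch on the JVM's reported pointer width. A sketch with illustrative constants (the class and method names are mine, and real header sizes vary with the VM and its flags):

```java
// Hedged sketch of the pointer-width switch discussed above; not
// RamUsageEstimator's actual code, and the figures are illustrative only.
public class PointerWidth {
    static int objectRefBytes(int addressSize) {
        switch (addressSize) {
            case 4:  return 4; // 32-bit JVM, or 64-bit with compressed oops
            case 8:  return 8; // plain 64-bit JVM
            default: // "the philosophical case" of an exotic architecture
                throw new AssertionError("unexpected pointer size: " + addressSize);
        }
    }
}
```

The design question in the comment is exactly what this default branch should do: fail loudly, or silently pick a conservative value.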
[jira] [Commented] (LUCENE-3867) RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is incorrect
[ https://issues.apache.org/jira/browse/LUCENE-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230644#comment-13230644 ] Uwe Schindler commented on LUCENE-3867: --- Maybe this for @UweSays:
{noformat}
default: throw new Error("Your processor(*) hit me with his " + addressSize + " inch dick"); // (*)Dawid
{noformat}