Re: Welcome Itamar Syn-Hershk​o as a new committer

2012-05-23 Thread Itamar Syn-Hershko
Thanks guys

On Wed, May 23, 2012 at 1:14 AM, zoolette gaufre...@gmail.com wrote:

 Welcome, Itamar!

 2012/5/22 Prescott Nasser geobmx...@hotmail.com

 
  Hey all,
  I'd like to officially welcome Itamar as a new committer. I know the
  community appreciates the work you've been doing with the Spatial contrib
  project and the past help you've provided on the mailing lists.
  Please join me in welcoming Itamar,
  ~Prescott



PyLucene3.6 windows binaries

2012-05-23 Thread Thomas Koch
PyLucene 3.6.0 for Python 2.6/2.7 is now available as a pre-compiled binary
for Windows (32-bit) from the pylucene-extra site at
http://code.google.com/a/apache-extras.org/p/pylucene-extra

Note: pylucene-extra is not an official Apache project, but rather an
attempt to lower the entry barrier to PyLucene by providing some prebuilt
eggs. Further contributions (for other platforms or combinations of 32/64-bit
and Python 2.x, etc.) are highly welcome!

best regards

Thomas 





[jira] [Resolved] (SOLR-3464) softCommit option for HttpSolrServer commit method

2012-05-23 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili resolved SOLR-3464.
---

   Resolution: Fixed
Fix Version/s: 4.0
 Assignee: Tommaso Teofili

 softCommit option for HttpSolrServer commit method
 --

 Key: SOLR-3464
 URL: https://issues.apache.org/jira/browse/SOLR-3464
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Affects Versions: 4.0
Reporter: Marco Crivellaro
Assignee: Tommaso Teofili
Priority: Minor
 Fix For: 4.0


 The HttpSolrServer.commit method doesn't have a softCommit option, which is
 available for the commit command:
 http://wiki.apache.org/solr/UpdateXmlMessages#A.22commit.22_and_.22optimize.22
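Per the wiki page above, the underlying XML update message already carries this flag, so the improvement is about exposing it through SolrJ as well. A hand-written soft commit message would look roughly like this (attribute names as documented on that wiki page):

```xml
<!-- request a soft commit via the XML update message -->
<commit softCommit="true" waitSearcher="true"/>
```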

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #162

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/162/

--
[...truncated 16267 lines...]
   [junit4]   2at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   [junit4]   2at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   [junit4]   2at java.lang.Thread.run(Thread.java:662)
   [junit4]   2 
   [junit4]   2 61062 T2869 oas.SolrTestCaseJ4.tearDown ###Ending test
   [junit4]   1 replicate slave to master
   [junit4]   2 NOTE: reproduce with: ant test 
-Dtestcase=TestReplicationHandler -Dtests.method=test 
-Dtests.seed=A486C222F6A861EE -Dtests.locale=hr -Dtests.timezone=Asia/Jakarta 
-Dargs=-Dfile.encoding=Cp1252
   [junit4]   1 
   [junit4]   2
   [junit4] (@AfterClass output)
   [junit4]   2 61089 T2869 oasc.CoreContainer.shutdown Shutting down 
CoreContainer instance=831846536
   [junit4]   2 61089 T2869 oasc.SolrCore.close [collection1]  CLOSING 
SolrCore org.apache.solr.core.SolrCore@129b3cec
   [junit4]   2 61090 T2869 oasc.SolrCore.closeSearcher [collection1] Closing 
main searcher on request.
   [junit4]   2 61090 T2869 oasu.DirectUpdateHandler2.close closing 
DirectUpdateHandler2{commits=2,autocommits=0,soft 
autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=0,adds=0,deletesById=0,deletesByQuery=0,errors=0,cumulative_adds=494,cumulative_deletesById=0,cumulative_deletesByQuery=1,cumulative_errors=0}
   [junit4]   2 61099 T2869 oejsh.ContextHandler.doStop stopped 
o.e.j.s.ServletContextHandler{/solr,null}
   [junit4]   2 61154 T2869 oasc.CoreContainer.shutdown Shutting down 
CoreContainer instance=1517759769
   [junit4]   2 61155 T2869 oasc.SolrCore.close [collection1]  CLOSING 
SolrCore org.apache.solr.core.SolrCore@27549904
   [junit4]   2 61155 T2869 oasc.SolrCore.closeSearcher [collection1] Closing 
main searcher on request.
   [junit4]   2 61156 T2869 oasu.DirectUpdateHandler2.close closing 
DirectUpdateHandler2{commits=1,autocommits=0,soft 
autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=0,adds=0,deletesById=0,deletesByQuery=0,errors=0,cumulative_adds=0,cumulative_deletesById=0,cumulative_deletesByQuery=0,cumulative_errors=0}
   [junit4]   2 61158 T2869 oejsh.ContextHandler.doStop stopped 
o.e.j.s.ServletContextHandler{/solr,null}
   [junit4]   2 61235 T2869 oas.SolrTestCaseJ4.deleteCore ###deleteCore
   [junit4]   2 NOTE: test params are: codec=Lucene40, 
sim=RandomSimilarityProvider(queryNorm=false,coord=false): {}, locale=hr, 
timezone=Asia/Jakarta
   [junit4]   2 NOTE: Windows 7 6.1 amd64/Sun Microsystems Inc. 1.6.0_32 
(64-bit)/cpus=2,threads=1,free=147193000,total=271253504
   [junit4]   2 NOTE: All tests run in this JVM: [TestChineseTokenizerFactory, 
TestExtendedDismaxParser, TestQueryUtils, SampleTest, TestPseudoReturnFields, 
TestNumberUtils, TestBeiderMorseFilterFactory, TestMultiCoreConfBootstrap, 
TestItalianLightStemFilterFactory, TestSolrDeletionPolicy1, 
RAMDirectoryFactoryTest, TestLFUCache, DocumentAnalysisRequestHandlerTest, 
TestSpanishLightStemFilterFactory, TestSolrCoreProperties, 
IndexBasedSpellCheckerTest, JsonLoaderTest, TestValueSourceCache, 
UniqFieldsUpdateProcessorFactoryTest, ZkNodePropsTest, 
TestUAX29URLEmailTokenizerFactory, JSONWriterTest, SortByFunctionTest, 
FieldMutatingUpdateProcessorTest, TestPropInject, TestGermanStemFilterFactory, 
TestTrie, ZkSolrClientTest, DateMathParserTest, SpellCheckComponentTest, 
TestTypeTokenFilterFactory, HighlighterConfigTest, TestQuerySenderListener, 
PrimUtilsTest, IndexReaderFactoryTest, TestNorwegianLightStemFilterFactory, 
SystemInfoHandlerTest, TestLRUCache, FullSolrCloudDistribCmdsTest, 
TestGermanNormalizationFilterFactory, TestFunctionQuery, 
CommonGramsQueryFilterFactoryTest, OpenExchangeRatesOrgProviderTest, 
SolrCoreCheckLockOnStartupTest, TestPortugueseStemFilterFactory, OverseerTest, 
TestIndonesianStemFilterFactory, TestPerFieldSimilarity, TestHashPartitioner, 
TestOmitPositions, SoftAutoCommitTest, StandardRequestHandlerTest, 
TestRecovery, TestBM25SimilarityFactory, TestRangeQuery, StatsComponentTest, 
DistributedTermsComponentTest, TestDocSet, TestBinaryField, 
TestPhraseSuggestions, TestCollationKeyFilterFactory, DebugComponentTest, 
TestShingleFilterFactory, TestJoin, TestUtils, ReturnFieldsTest, 
SimpleFacetsTest, TestIndexingPerformance, MBeansHandlerTest, 
TestPersianNormalizationFilterFactory, TestRemoveDuplicatesTokenFilterFactory, 
TestRussianLightStemFilterFactory, PrimitiveFieldTypeTest, 
LeaderElectionIntegrationTest, TestWordDelimiterFilterFactory, 
TestCJKTokenizerFactory, IndexSchemaTest, TimeZoneUtilsTest, TestSynonymMap, 
AutoCommitTest, SOLR749Test, BadIndexSchemaTest, TestChineseFilterFactory, 
TermsComponentTest, BasicFunctionalityTest, TestSynonymFilterFactory, 
UUIDFieldTest, DateFieldTest, TestArbitraryIndexDir, 
TestLMDirichletSimilarityFactory, SolrIndexConfigTest, TestJmxMonitoredMap, 

Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java7-64 #95

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/95/

--
[...truncated 11919 lines...]
   [junit4]   2 20820 T3143 oasc.RequestHandlers.initHandlersFromConfig 
created /terms: org.apache.solr.handler.component.SearchHandler
   [junit4]   2 20821 T3143 oasc.RequestHandlers.initHandlersFromConfig 
created spellCheckCompRH: org.apache.solr.handler.component.SearchHandler
   [junit4]   2 20821 T3143 oasc.RequestHandlers.initHandlersFromConfig 
created spellCheckCompRH_Direct: org.apache.solr.handler.component.SearchHandler
   [junit4]   2 20821 T3143 oasc.RequestHandlers.initHandlersFromConfig 
created spellCheckCompRH1: org.apache.solr.handler.component.SearchHandler
   [junit4]   2 20822 T3143 oasc.RequestHandlers.initHandlersFromConfig 
created tvrh: org.apache.solr.handler.component.SearchHandler
   [junit4]   2 20822 T3143 oasc.RequestHandlers.initHandlersFromConfig 
created /mlt: solr.MoreLikeThisHandler
   [junit4]   2 20823 T3143 oasc.RequestHandlers.initHandlersFromConfig 
created /debug/dump: solr.DumpRequestHandler
   [junit4]   2 20825 T3143 oashl.XMLLoader.init xsltCacheLifetimeSeconds=60
   [junit4]   2 20827 T3143 oasc.SolrCore.initDeprecatedSupport WARNING 
solrconfig.xml uses deprecated admin/gettableFiles, Please update your config 
to use the ShowFileRequestHandler.
   [junit4]   2 20830 T3143 oasc.SolrCore.initDeprecatedSupport WARNING adding 
ShowFileRequestHandler with hidden files: [SOLRCONFIG-HIGHLIGHT.XML, 
SCHEMA-REQUIRED-FIELDS.XML, SCHEMA-REPLICATION2.XML, SCHEMA-MINIMAL.XML, 
BAD-SCHEMA-DUP-DYNAMICFIELD.XML, SOLRCONFIG-CACHING.XML, 
SOLRCONFIG-REPEATER.XML, CURRENCY.XML, BAD-SCHEMA-NONTEXT-ANALYZER.XML, 
SOLRCONFIG-MERGEPOLICY.XML, SOLRCONFIG-TLOG.XML, SOLRCONFIG-MASTER.XML, 
SCHEMA11.XML, SOLRCONFIG-BASIC.XML, DA_COMPOUNDDICTIONARY.TXT, 
SCHEMA-COPYFIELD-TEST.XML, SOLRCONFIG-SLAVE.XML, ELEVATE.XML, 
SOLRCONFIG-PROPINJECT-INDEXDEFAULT.XML, SCHEMA-IB.XML, 
SOLRCONFIG-QUERYSENDER.XML, SCHEMA-REPLICATION1.XML, DA_UTF8.XML, 
HYPHENATION.DTD, SOLRCONFIG-ENABLEPLUGIN.XML, STEMDICT.TXT, 
SCHEMA-PHRASESUGGEST.XML, HUNSPELL-TEST.AFF, STOPTYPES-1.TXT, 
STOPWORDSWRONGENCODING.TXT, SCHEMA-NUMERIC.XML, SOLRCONFIG-TRANSFORMERS.XML, 
SOLRCONFIG-PROPINJECT.XML, BAD-SCHEMA-NOT-INDEXED-BUT-TF.XML, 
SOLRCONFIG-SIMPLELOCK.XML, WDFTYPES.TXT, STOPTYPES-2.TXT, SCHEMA-REVERSED.XML, 
SOLRCONFIG-SPELLCHECKCOMPONENT.XML, SCHEMA-DFR.XML, 
SOLRCONFIG-PHRASESUGGEST.XML, BAD-SCHEMA-NOT-INDEXED-BUT-POS.XML, KEEP-1.TXT, 
OPEN-EXCHANGE-RATES.JSON, STOPWITHBOM.TXT, SCHEMA-BINARYFIELD.XML, 
SOLRCONFIG-SPELLCHECKER.XML, SOLRCONFIG-UPDATE-PROCESSOR-CHAINS.XML, 
BAD-SCHEMA-OMIT-TF-BUT-NOT-POS.XML, BAD-SCHEMA-DUP-FIELDTYPE.XML, 
SOLRCONFIG-MASTER1.XML, SYNONYMS.TXT, SCHEMA.XML, SCHEMA_CODEC.XML, 
SOLRCONFIG-SOLR-749.XML, SOLRCONFIG-MASTER1-KEEPONEBACKUP.XML, STOP-2.TXT, 
SOLRCONFIG-FUNCTIONQUERY.XML, SCHEMA-LMDIRICHLET.XML, SOLRCONFIG-TERMINDEX.XML, 
SOLRCONFIG-ELEVATE.XML, STOPWORDS.TXT, SCHEMA-FOLDING.XML, 
SCHEMA-STOP-KEEP.XML, BAD-SCHEMA-NOT-INDEXED-BUT-NORMS.XML, 
SOLRCONFIG-SOLCOREPROPERTIES.XML, STOP-1.TXT, SOLRCONFIG-MASTER2.XML, 
SCHEMA-SPELLCHECKER.XML, SOLRCONFIG-LAZYWRITER.XML, 
SCHEMA-LUCENEMATCHVERSION.XML, BAD-MP-SOLRCONFIG.XML, FRENCHARTICLES.TXT, 
SCHEMA15.XML, SOLRCONFIG-REQHANDLER.INCL, SCHEMASURROUND.XML, 
SCHEMA-COLLATEFILTER.XML, SOLRCONFIG-MASTER3.XML, HUNSPELL-TEST.DIC, 
SOLRCONFIG-XINCLUDE.XML, SOLRCONFIG-DELPOLICY1.XML, SOLRCONFIG-SLAVE1.XML, 
SCHEMA-SIM.XML, SCHEMA-COLLATE.XML, STOP-SNOWBALL.TXT, PROTWORDS.TXT, 
SCHEMA-TRIE.XML, SOLRCONFIG_CODEC.XML, SCHEMA-TFIDF.XML, 
SCHEMA-LMJELINEKMERCER.XML, PHRASESUGGEST.TXT, 
SOLRCONFIG-BASIC-LUCENEVERSION31.XML, OLD_SYNONYMS.TXT, 
SOLRCONFIG-DELPOLICY2.XML, XSLT, SOLRCONFIG-NATIVELOCK.XML, 
BAD-SCHEMA-DUP-FIELD.XML, SOLRCONFIG-NOCACHE.XML, SCHEMA-BM25.XML, 
SOLRCONFIG-ALTDIRECTORY.XML, SOLRCONFIG-QUERYSENDER-NOQUERY.XML, 
COMPOUNDDICTIONARY.TXT, SOLRCONFIG_PERF.XML, 
SCHEMA-NOT-REQUIRED-UNIQUE-KEY.XML, KEEP-2.TXT, SCHEMA12.XML, 
MAPPING-ISOLATIN1ACCENT.TXT, BAD_SOLRCONFIG.XML, 
BAD-SCHEMA-EXTERNAL-FILEFIELD.XML]
   [junit4]   2 20834 T3143 oass.SolrIndexSearcher.init Opening 
Searcher@728679fd main
   [junit4]   2 20834 T3143 oass.SolrIndexSearcher.init WARNING WARNING: 
Directory impl does not support setting indexDir: 
org.apache.lucene.store.MockDirectoryWrapper
   [junit4]   2 20834 T3143 oasu.CommitTracker.init Hard AutoCommit: disabled
   [junit4]   2 20835 T3143 oasu.CommitTracker.init Soft AutoCommit: disabled
   [junit4]   2 20835 T3143 oashc.SpellCheckComponent.inform Initializing 
spell checkers
   [junit4]   2 20845 T3143 oass.DirectSolrSpellChecker.init init: 
{name=direct,classname=DirectSolrSpellChecker,field=lowerfilt,minQueryLength=3}
   [junit4]   2 20895 T3143 oashc.HttpShardHandlerFactory.getParameter Setting 
socketTimeout to: 0
   [junit4]   2 20895 T3143 oashc.HttpShardHandlerFactory.getParameter Setting 
urlScheme to: http://
   [junit4]   2 20895 T3143 

[jira] [Commented] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen

2012-05-23 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281442#comment-13281442
 ] 

Dawid Weiss commented on LUCENE-4062:
-

You didn't attach the updated benchmark -- I didn't say it explicitly, but you 
should do something with the resulting value (the JIT optimizer is quite smart ;). 
A field store (writing the result to a field) should do the trick. So would 
System.out.println, of course...

All this may sound paranoid but really isn't. This is a source of many problems 
with microbenchmarks -- the compiler just throws away (or optimizes) loops and 
branches in a way that doesn't happen later in real code. My recent 
favorite example of such a problem in real-life code (it's a bug in the JDK) is 
this one:

http://hg.openjdk.java.net/jdk8/tl/jdk/rev/332bebb463d1
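The "field store" advice above can be sketched as follows; `BenchSketch`, its field name, and the workload are illustrative only, not code from the thread:

```java
// Minimal sketch: keep a benchmark result "live" so the JIT cannot
// eliminate the measured loop as dead code.
public class BenchSketch {
    static long sink; // field store defeats dead-code elimination

    static long sum(int[] data) {
        long s = 0;
        for (int v : data) s += v;
        return s;
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; ++i) data[i] = i & 0xFF;

        long start = System.nanoTime();
        long s = sum(data);
        long elapsed = System.nanoTime() - start;

        sink = s;                                        // result is observable: loop must run
        System.out.println(s + " in " + elapsed + "ns"); // printing works too
    }
}
```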

 More fine-grained control over the packed integer implementation that is 
 chosen
 ---

 Key: LUCENE-4062
 URL: https://issues.apache.org/jira/browse/LUCENE-4062
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/other
Reporter: Adrien Grand
Assignee: Michael McCandless
Priority: Minor
  Labels: performance
 Fix For: 4.1

 Attachments: LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, 
 LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch


 In order to save space, Lucene has two main PackedInts.Mutable implementations: 
 one that is very fast, based on a byte/short/integer/long array 
 (Direct*), and another that packs bits in a memory-efficient manner 
 (Packed*).
 The packed implementation tends to be much slower than the direct one, which 
 discourages some Lucene components from using it. On the other hand, if you store 
 21-bit integers in a Direct32, this is a space loss of (32-21)/32 ≈ 34%.
 If you accept trading some space for speed, you can store 3 of these 21-bit 
 integers in a long, resulting in an overhead of 1/3 bit per value. One 
 advantage of this approach is that you never need to read more than one block 
 to read or write a value, so this can be significantly faster than Packed32 
 and Packed64, which always need to read/write two blocks in order to avoid 
 costly branches.
 I ran some tests, and for 1000 21-bit values, this implementation takes 
 less than 2% more space and has 44% faster writes and 30% faster reads. The 
 12-bit version (5 values per block) has the same performance improvement and 
 a 6% memory overhead compared to the packed implementation.
 In order to select the best implementation for a given integer size, I wrote 
 the {{PackedInts.getMutable(valueCount, bitsPerValue, 
 acceptableOverheadPerValue)}} method. This method selects the fastest 
 implementation that has less than {{acceptableOverheadPerValue}} wasted bits 
 per value. For example, if you accept an overhead of 20% 
 ({{acceptableOverheadPerValue = 0.2f * bitsPerValue}}), which is pretty 
 reasonable, here is what implementations would be selected:
  * 1: Packed64SingleBlock1
  * 2: Packed64SingleBlock2
  * 3: Packed64SingleBlock3
  * 4: Packed64SingleBlock4
  * 5: Packed64SingleBlock5
  * 6: Packed64SingleBlock6
  * 7: Direct8
  * 8: Direct8
  * 9: Packed64SingleBlock9
  * 10: Packed64SingleBlock10
  * 11: Packed64SingleBlock12
  * 12: Packed64SingleBlock12
  * 13: Packed64
  * 14: Direct16
  * 15: Direct16
  * 16: Direct16
  * 17: Packed64
  * 18: Packed64SingleBlock21
  * 19: Packed64SingleBlock21
  * 20: Packed64SingleBlock21
  * 21: Packed64SingleBlock21
  * 22: Packed64
  * 23: Packed64
  * 24: Packed64
  * 25: Packed64
  * 26: Packed64
  * 27: Direct32
  * 28: Direct32
  * 29: Direct32
  * 30: Direct32
  * 31: Direct32
  * 32: Direct32
  * 33: Packed64
  * 34: Packed64
  * 35: Packed64
  * 36: Packed64
  * 37: Packed64
  * 38: Packed64
  * 39: Packed64
  * 40: Packed64
  * 41: Packed64
  * 42: Packed64
  * 43: Packed64
  * 44: Packed64
  * 45: Packed64
  * 46: Packed64
  * 47: Packed64
  * 48: Packed64
  * 49: Packed64
  * 50: Packed64
  * 51: Packed64
  * 52: Packed64
  * 53: Packed64
  * 54: Direct64
  * 55: Direct64
  * 56: Direct64
  * 57: Direct64
  * 58: Direct64
  * 59: Direct64
  * 60: Direct64
  * 61: Direct64
  * 62: Direct64
 Under 32 bits per value, only 13, 17 and 22-26 bits per value would still 
 choose the slower Packed64 implementation. Allowing a 50% overhead would 
 prevent the packed implementation from being selected for bits per value under 32. 
 Allowing an overhead of 32 bits per value would make sure that a Direct* 
 implementation is always selected.
 Next steps would be to:
  * make Lucene components use this {{getMutable}} method and let users decide 
 which trade-off suits them better,
  * write a Packed32SingleBlock implementation if necessary 
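The single-block layout described in the issue can be sketched as follows; this is an illustration of the idea only, not Lucene's actual Packed64SingleBlock code:

```java
// Illustrative sketch: pack three 21-bit values per 64-bit block, wasting
// 1 bit per block (1/3 bit per value), so a get/set never touches two blocks.
public class SingleBlockSketch {
    static final int BITS = 21;
    static final int PER_BLOCK = 64 / BITS;      // 3 values per long
    static final long MASK = (1L << BITS) - 1;   // low 21 bits

    final long[] blocks;

    SingleBlockSketch(int valueCount) {
        blocks = new long[(valueCount + PER_BLOCK - 1) / PER_BLOCK];
    }

    long get(int index) {
        int shift = (index % PER_BLOCK) * BITS;
        return (blocks[index / PER_BLOCK] >>> shift) & MASK; // single block read
    }

    void set(int index, long value) {
        int block = index / PER_BLOCK;
        int shift = (index % PER_BLOCK) * BITS;
        // clear the slot, then write the new value -- still a single block
        blocks[block] = (blocks[block] & ~(MASK << shift)) | (value << shift);
    }
}
```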

[HEADS-UP]: Index File Format Change on Trunk

2012-05-23 Thread Simon Willnauer
Hey folks,

I just committed LUCENE-4051 [1] (Revision 1341768), which changes the
file format of DocValues, Norms (DocValues), StoredFields &
TermVectors incompatibly with previous revisions. If you are using trunk
indices you must re-index before updating to the latest trunk
sources.
If you are using Lucene 3.x or below you can safely ignore this message.

happy indexing,

simon

[1] https://issues.apache.org/jira/browse/LUCENE-4051

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-4051) Fix File Headers for Lucene40 StoredFields & TermVectors

2012-05-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-4051.
-

Resolution: Fixed

Committed to trunk in rev. 1341768. I sent out a heads-up mail to the dev list 
since this breaks the index file format.

Thanks for reviewing; let's get 4.0-alpha out!

 Fix File Headers for Lucene40 StoredFields & TermVectors
 

 Key: LUCENE-4051
 URL: https://issues.apache.org/jira/browse/LUCENE-4051
 Project: Lucene - Java
  Issue Type: Task
  Components: core/codecs
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.0

 Attachments: LUCENE-4051.patch, LUCENE-4051.patch, LUCENE-4051.patch, 
 LUCENE-4051.patch, LUCENE-4051.patch


 Currently we still write the old file header format in 
 Lucene40StoredFieldFormat & Lucene40TermVectorsFormat. We should cut over to 
 using CodecUtil and reset the versioning before we release Lucene 4.0.




-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java6-64 #163

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/163/


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Build failed in Jenkins: Lucene-Solr-trunk-Linux-Java6-64 #470

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux-Java6-64/470/

--
[...truncated 4 lines...]
   [junit4]
   [junit4] Completed on J0 in 155.73s, 1 test, 1 failure <<< FAILURES!
   [junit4]  
   [junit4] Suite: org.apache.solr.EchoParamsTest
   [junit4] Completed on J0 in 0.11s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterWFSTTest
   [junit4] Completed on J0 in 0.80s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.util.TestUtils
   [junit4] Completed on J0 in 0.01s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.admin.LukeRequestHandlerTest
   [junit4] Completed on J0 in 1.54s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.similarities.TestBM25SimilarityFactory
   [junit4] Completed on J0 in 0.08s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.TestCSVLoader
   [junit4] Completed on J0 in 0.75s, 5 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.update.UpdateParamsTest
   [junit4] Completed on J0 in 0.55s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.analysis.TestFinnishLightStemFilterFactory
   [junit4] Completed on J0 in 0.01s, 1 test
   [junit4]  
   [junit4] Suite: 
org.apache.solr.handler.component.DistributedTermsComponentTest
   [junit4] Completed on J0 in 5.54s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.analysis.TestCJKBigramFilterFactory
   [junit4] Completed on J0 in 0.01s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.analysis.TestTrimFilterFactory
   [junit4] Completed on J0 in 0.00s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.response.TestCSVResponseWriter
   [junit4] Completed on J0 in 0.52s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.TestJoin
   [junit4] Completed on J1 in 40.93s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.request.TestFaceting
   [junit4] Completed on J0 in 8.58s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.TestHashPartitioner
   [junit4] Completed on J1 in 5.42s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.component.QueryElevationComponentTest
   [junit4] Completed on J0 in 3.41s, 7 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.function.TestFunctionQuery
   [junit4] Completed on J0 in 1.89s, 14 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.request.SimpleFacetsTest
   [junit4] Completed on J1 in 3.37s, 29 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterFSTTest
   [junit4] Completed on J0 in 0.78s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.StandardRequestHandlerTest
   [junit4] Completed on J0 in 0.56s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterTest
   [junit4] Completed on J0 in 0.77s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.SolrCoreTest
   [junit4] Completed on J1 in 3.05s, 5 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.BasicFunctionalityTest
   [junit4] IGNORED 0.00s J0 | BasicFunctionalityTest.testDeepPaging
   [junit4] Cause: Annotated @Ignore(See SOLR-1726)
   [junit4] Completed on J0 in 1.59s, 23 tests, 1 skipped
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestCoreContainer
   [junit4] Completed on J1 in 1.48s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.search.function.SortByFunctionTest
   [junit4] Completed on J0 in 1.18s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.CopyFieldTest
   [junit4] Completed on J0 in 0.41s, 6 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterTSTTest
   [junit4] Completed on J1 in 0.66s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.RequestHandlersTest
   [junit4] Completed on J0 in 0.54s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.highlight.FastVectorHighlighterTest
   [junit4] Completed on J1 in 0.57s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.XmlUpdateRequestHandlerTest
   [junit4] Completed on J0 in 0.50s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestQueryTypes
   [junit4] Completed on J1 in 0.47s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.analysis.TestReversedWildcardFilterFactory
   [junit4] Completed on J0 in 0.40s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.PrimitiveFieldTypeTest
   [junit4] Completed on J1 in 0.72s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.response.TestPHPSerializedResponseWriter
   [junit4] Completed on J0 in 0.50s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.DisMaxRequestHandlerTest
   [junit4] Completed on J1 in 0.60s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.RequiredFieldsTest
   [junit4] Completed on J0 in 0.49s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.IndexReaderFactoryTest
   [junit4] Completed on J1 in 0.47s, 1 test
   

Jenkins build is back to normal : Lucene-Solr-trunk-Linux-Java6-64 #471

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux-Java6-64/471/changes


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen

2012-05-23 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281460#comment-13281460
 ] 

Adrien Grand commented on LUCENE-4062:
--

Hi Dawid. Thanks for the link, it's very interesting!

I added a print statement to make sure that the sum is actually computed. Here 
is the code (for values of n > valueCount, just modify the k loop):

{code}
int valueCount = 1000;
int bitsPerValue = 21;
int[] offsets = new int[valueCount];
Random random = new Random();
for (int i = 0; i < valueCount; ++i) {
  offsets[i] = random.nextInt(valueCount);
}
byte[] bytes = new byte[valueCount * 4];
DataOutput out = new ByteArrayDataOutput(bytes);
PackedInts.Writer writer = PackedInts.getWriter(out, valueCount, bitsPerValue);
for (int i = 0; i < valueCount; ++i) {
  writer.add(random.nextInt(1 << bitsPerValue));
}
writer.finish();
long sum = 0L;
for (int i = 0; i < 50; ++i) {
  long start = System.nanoTime();
  DataInput in = new ByteArrayDataInput(bytes);
  // PackedInts.Reader reader = PackedInts.getReader(in, 0f); // Packed64
  PackedInts.Reader reader = PackedInts.getReader(in, 0.1f); // Packed64SingleBlock
  for (int k = 0; k < 1; ++k) {
    for (int j = 0, n = valueCount / 2; j < n; ++j) {
      sum += reader.get(offsets[j]);
    }
  }
  long end = System.nanoTime();
  System.out.println("sum is " + sum);
  System.out.println(end - start);
}
{code}

I'm on a different computer today and n = valueCount/3 is enough to make the 
benchmark faster with Packed64SingleBlock.

 More fine-grained control over the packed integer implementation that is 
 chosen
 ---

 Key: LUCENE-4062
 URL: https://issues.apache.org/jira/browse/LUCENE-4062
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/other
Reporter: Adrien Grand
Assignee: Michael McCandless
Priority: Minor
  Labels: performance
 Fix For: 4.1

 Attachments: LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, 
 LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch


 In order to save space, Lucene has two main PackedInts.Mutable implementations: 
 one that is very fast, based on a byte/short/integer/long array 
 (Direct*), and another that packs bits in a memory-efficient manner 
 (Packed*).
 The packed implementation tends to be much slower than the direct one, which 
 discourages some Lucene components from using it. On the other hand, if you store 
 21-bit integers in a Direct32, this is a space loss of (32-21)/32 ≈ 34%.
 If you accept trading some space for speed, you can store 3 of these 21-bit 
 integers in a long, resulting in an overhead of 1/3 bit per value. One 
 advantage of this approach is that you never need to read more than one block 
 to read or write a value, so this can be significantly faster than Packed32 
 and Packed64, which always need to read/write two blocks in order to avoid 
 costly branches.
 I ran some tests, and for 1000 21-bit values, this implementation takes 
 less than 2% more space and has 44% faster writes and 30% faster reads. The 
 12-bit version (5 values per block) has the same performance improvement and 
 a 6% memory overhead compared to the packed implementation.
 In order to select the best implementation for a given integer size, I wrote 
 the {{PackedInts.getMutable(valueCount, bitsPerValue, 
 acceptableOverheadPerValue)}} method. This method selects the fastest 
 implementation that has less than {{acceptableOverheadPerValue}} wasted bits 
 per value. For example, if you accept an overhead of 20% 
 ({{acceptableOverheadPerValue = 0.2f * bitsPerValue}}), which is pretty 
 reasonable, here is what implementations would be selected:
  * 1: Packed64SingleBlock1
  * 2: Packed64SingleBlock2
  * 3: Packed64SingleBlock3
  * 4: Packed64SingleBlock4
  * 5: Packed64SingleBlock5
  * 6: Packed64SingleBlock6
  * 7: Direct8
  * 8: Direct8
  * 9: Packed64SingleBlock9
  * 10: Packed64SingleBlock10
  * 11: Packed64SingleBlock12
  * 12: Packed64SingleBlock12
  * 13: Packed64
  * 14: Direct16
  * 15: Direct16
  * 16: Direct16
  * 17: Packed64
  * 18: Packed64SingleBlock21
  * 19: Packed64SingleBlock21
  * 20: Packed64SingleBlock21
  * 21: Packed64SingleBlock21
  * 22: Packed64
  * 23: Packed64
  * 24: Packed64
  * 25: Packed64
  * 26: Packed64
  * 27: Direct32
  * 28: Direct32
  * 29: Direct32
  * 30: Direct32
  * 31: Direct32
  * 32: Direct32
  * 33: Packed64
  * 34: Packed64
  * 35: Packed64
  * 36: Packed64
  * 37: Packed64
  * 38: Packed64
  * 39: Packed64
  * 40: Packed64
  * 41: Packed64
  * 42: Packed64
  * 43: Packed64
  * 44: Packed64
  * 45: Packed64
  * 46: Packed64
  * 47: Packed64
  * 48: Packed64
  * 49: Packed64
  * 50: Packed64
  * 51: Packed64
  * 52: Packed64
  * 

N-Gram Threshold

2012-05-23 Thread parkhekishor
Hi, 
I made an n-gram analyzer, but I am not able to set a threshold when searching
against the index. Please help me.
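Lucene itself has no single "threshold" knob for n-gram matching, but the idea the poster seems to be after (accept only candidates whose n-gram overlap with the query exceeds some ratio) can be sketched in plain Java; this is an illustration of the concept, not Lucene API:

```java
import java.util.HashSet;
import java.util.Set;

public class NGramMatch {
    // Collect all n-grams of the given size from a string.
    static Set<String> ngrams(String s, int n) {
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + n <= s.length(); ++i) {
            grams.add(s.substring(i, i + n));
        }
        return grams;
    }

    // Fraction of the query's n-grams that also occur in the candidate.
    static double overlap(String query, String candidate, int n) {
        Set<String> q = ngrams(query, n);
        if (q.isEmpty()) return 0.0;
        Set<String> c = ngrams(candidate, n);
        int hits = 0;
        for (String g : q) if (c.contains(g)) ++hits;
        return (double) hits / q.size();
    }

    // Accept a candidate only above a similarity threshold.
    static boolean matches(String query, String candidate, int n, double threshold) {
        return overlap(query, candidate, n) >= threshold;
    }
}
```

With trigrams, "lucene" vs "lucent" share 3 of 4 query grams (overlap 0.75), so a 0.5 threshold accepts it while an unrelated string is rejected.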
 

-
REACH YOUR GOAL BEFORE GOAL KICKS YOU.

Thanks.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/N-Gram-Threshould-tp3985614.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java7-64 #96

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/96/changes


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2614) stats with pivot

2012-05-23 Thread Marek Woroniecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281478#comment-13281478
 ] 

Marek Woroniecki commented on SOLR-2614:


I would also like to see this feature.

In our app we do a lot of faceting to let users drill down into the data by 
selecting particular values of particular fields. We also calculate stats for 
these selections using the stats component. However, quite often we need to group 
documents into chunks by their common attributes and then calculate stats as 
well. In a classic database approach we would probably do that with a GROUP BY 
clause and some aggregating functions. Unfortunately, for various reasons 
that is not an easy option in our case, and we are left with either 
reading all the documents and grouping them in memory, or having our users 
extract all the data to CSV and do pivots / stats in Excel. 

I would be more than happy to implement this patch, if only I knew more about 
how Lucene / Solr works internally :(
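The in-memory fallback described above (group documents by a field combination, then aggregate a numeric field) can be sketched like this; the field names, document shape, and class names are made up for illustration:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PivotStatsSketch {
    // Per-group running statistics, in the spirit of Solr's StatsComponent.
    static final class Stats {
        long count;
        double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY, sum;

        void add(double v) {
            ++count;
            min = Math.min(min, v);
            max = Math.max(max, v);
            sum += v;
        }

        double mean() { return sum / count; }
    }

    // Group rows by a pivot key (e.g. field_x + "/" + field_y) and aggregate
    // the numeric field, like a GROUP BY with aggregate functions.
    static Map<String, Stats> pivotStats(List<String[]> rows) {
        Map<String, Stats> byKey = new HashMap<>();
        for (String[] row : rows) {          // row = {field_x, field_y, numeric}
            String key = row[0] + "/" + row[1];
            byKey.computeIfAbsent(key, k -> new Stats())
                 .add(Double.parseDouble(row[2]));
        }
        return byKey;
    }
}
```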

 stats with pivot
 

 Key: SOLR-2614
 URL: https://issues.apache.org/jira/browse/SOLR-2614
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 4.0
Reporter: pengyao
Priority: Critical
 Fix For: 4.1


  Is it possible to get stats (like the Stats Component: min, max, sum, count,
 missing, sumOfSquares, mean and stddev) from numeric fields inside
 hierarchical facets (with more than one level, like pivot)?
  I would like to query:
 ...?q=*:*&version=2.2&start=0&rows=0&stats=true&stats.field=numeric_field1&stats.field=numeric_field2&stats.pivot=field_x,field_y,field_z
  and get min, max, sum, count, etc. from numeric_field1 and
 numeric_field2 for all combinations of field_x, field_y and field_z
 (hierarchical values).
  Using stats.facet I get just one field at one level, and using
 facet.pivot I get just counts, but no stats.
  Looping in the client application over all combinations of facet values
 would be too slow, because there are a lot of combinations.
  Thanks a lot!
 This is very important, because having only the count values is sometimes of no use.
 Please add stats with pivot in Solr 4.0.
 Thanks a lot

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-tests-only-trunk - Build # 14280 - Failure

2012-05-23 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/14280/

1 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest

Error Message:
ERROR: SolrIndexSearcher opens=80 closes=78

Stack Trace:
java.lang.AssertionError: ERROR: SolrIndexSearcher opens=80 closes=78
at __randomizedtesting.SeedInfo.seed([7A9E536CED82AC23]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:190)
at org.apache.solr.SolrTestCaseJ4.afterClass(SolrTestCaseJ4.java:82)
at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:752)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at 
org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
at 
org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)




Build Log (for compile errors):
[...truncated 11378 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-3476) Create a Solr Core with a given commit point

2012-05-23 Thread ludovic Boutros (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281484#comment-13281484
 ] 

ludovic Boutros commented on SOLR-3476:
---

Some examples of usage:

- Create a new core with a given commit point generation:

bq. 
http://localhost:8084/solr/admin/cores?action=CREATE&name=core4&commitPointGeneration=4&instanceDir=test

- Get the status of this core:

bq. http://localhost:8084/solr/admin/cores?action=STATUS&core=core4

{code:xml}
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">2692</int>
  </lst>
  <lst name="status">
    <lst name="core4">
      <str name="name">core4</str>
      <str name="instanceDir">D:\temp\bases\testCores\test\</str>
      <str name="dataDir">D:\temp\bases\testCores\test\data\</str>
      <date name="startTime">2012-05-23T09:31:50.483Z</date>
      <long name="uptime">149054</long>
      <long name="indexCommitGeneration">4</long>
      <lst name="indexCommitList">
        <long name="generation">1</long>
        <long name="generation">2</long>
        <long name="generation">3</long>
        <long name="generation">4</long>
        <long name="generation">5</long>
        <long name="generation">6</long>
        <long name="generation">7</long>
      </lst>
      <lst name="index">
        <int name="numDocs">3</int>
        <int name="maxDoc">3</int>
        <long name="version">1337759534761</long>
        <int name="segmentCount">3</int>
        <bool name="current">false</bool>
        <bool name="hasDeletions">false</bool>
        <str name="directory">org.apache.lucene.store.SimpleFSDirectory:org.apache.lucene.store.SimpleFSDirectory@D:\temp\bases\testCores\test\data\index lockFactory=org.apache.lucene.store.NativeFSLockFactory@1c24b45</str>
        <date name="lastModified">2012-05-23T09:22:10.713Z</date>
      </lst>
    </lst>
  </lst>
</response>
{code}

We can see the current commit point generation and the available commit point 
list.

- Now the solr.xml file:

{code:xml}
<solr sharedLib="lib" persistent="true">
  <cores adminPath="/admin/cores">
    <core name="core4" instanceDir="test\" commitPointGeneration="4"/>
  </cores>
</solr>
{code}
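For illustration, the CREATE call above could be assembled like this (a sketch; {{commitPointGeneration}} is the parameter proposed in the attached patch, not a released CoreAdmin parameter):

```python
from urllib.parse import urlencode

# commitPointGeneration is the parameter added by the patch on this issue;
# the other CoreAdmin parameters (action, name, instanceDir) are standard.
params = {
    "action": "CREATE",
    "name": "core4",
    "instanceDir": "test",
    "commitPointGeneration": 4,
}
url = "http://localhost:8084/solr/admin/cores?" + urlencode(params)
```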



 Create a Solr Core with a given commit point
 

 Key: SOLR-3476
 URL: https://issues.apache.org/jira/browse/SOLR-3476
 Project: Solr
  Issue Type: New Feature
  Components: multicore
Affects Versions: 3.6
Reporter: ludovic Boutros
 Attachments: commitPoint.patch


 In some configurations we need to open new cores with a given commit point.
 For instance, when the publication of new documents must be controlled (legal 
 obligations) in a master-slave configuration, there are two cores on the same 
 instanceDir and dataDir which use two versions of the index.
 The switch between the two cores is done manually.
 The problem is that when replication is done one day before the switch, if 
 any problem occurs and we need to restart Tomcat, the new documents are 
 published prematurely.
 With this functionality, we could ensure that the index generation used by 
 the querying core is always the correct one. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-3480) Refactor httpclient impl details into a utility class

2012-05-23 Thread Sami Siren (JIRA)
Sami Siren created SOLR-3480:


 Summary: Refactor httpclient impl details into a utility class
 Key: SOLR-3480
 URL: https://issues.apache.org/jira/browse/SOLR-3480
 Project: Solr
  Issue Type: Improvement
  Components: clients - java, replication (java), SolrCloud
Reporter: Sami Siren
Assignee: Sami Siren
Priority: Minor


Currently there are multiple classes that deal with the implementation details 
of HttpClient when setting timeouts, basic-auth details, retry handling, 
compression, etc. I am proposing that we instead move this functionality into a 
reusable utility class. 

The ultimate goal is to be able to easily use, for example, HTTPS or basic auth 
(which can already be used in some parts of Solr) throughout Solr, but that 
will require some more work.

I will submit a patch shortly.
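As a language-neutral sketch of the idea (in Python, with hypothetical parameter names; the actual patch targets Apache HttpClient in Java), the pattern is a single place that merges shared defaults with per-caller overrides, so replication, SolrCloud, and SolrJ all build clients the same way:

```python
# Hypothetical parameter names for illustration only -- not Solr's API.
DEFAULTS = {
    "connTimeout": 5000,      # connection timeout in ms
    "soTimeout": 30000,       # socket read timeout in ms
    "allowCompression": True,
    "retryCount": 1,
}

def create_client_config(overrides=None):
    """Merge caller overrides onto the shared defaults, so every caller
    configures its HTTP client through one code path."""
    config = dict(DEFAULTS)
    if overrides:
        config.update(overrides)
    return config

# A caller that only needs a longer read timeout changes just that key.
cfg = create_client_config({"soTimeout": 60000})
```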

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3480) Refactor httpclient impl details into a utility class

2012-05-23 Thread Sami Siren (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren updated SOLR-3480:
-

Attachment: SOLR-3480.patch

 Refactor httpclient impl details into a utility class
 -

 Key: SOLR-3480
 URL: https://issues.apache.org/jira/browse/SOLR-3480
 Project: Solr
  Issue Type: Improvement
  Components: clients - java, replication (java), SolrCloud
Reporter: Sami Siren
Assignee: Sami Siren
Priority: Minor
 Attachments: SOLR-3480.patch


 Currently there are multiple classes that deal with the impl details of 
 httpclient when setting timeouts, basic auth details, retry handling, 
 compression etc. I am proposing that we instead move this functionality into 
 a reusable utility class. 
 The ultimate goal is to be able to easily use for example https or basic auth 
 (that can already be used in some parts of solr) throughout solr but that 
 will require some more work.
 I will submit a patch shortly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3478) DataImportHandler's Entity must have a name

2012-05-23 Thread Stefan Matheis (steffkes) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) updated SOLR-3478:


Assignee: (was: Stefan Matheis (steffkes))

Ah okay! I opened this issue; credits to [Emma, who reported this on the 
ML|http://lucene.472066.n3.nabble.com/Solr-mail-dataimporter-cannot-be-found-tc3985223.html].

James, will you take care of this one? Then I'll remove my patch, because the 
name should not be required, right?

 DataImportHandler's Entity must have a name
 ---

 Key: SOLR-3478
 URL: https://issues.apache.org/jira/browse/SOLR-3478
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
 Environment: r1341454, {code}java 
 -Dsolr.solr.home=./example-DIH/solr/ -jar start.jar{code}
Reporter: Stefan Matheis (steffkes)
 Fix For: 4.0

 Attachments: SOLR-3478.patch


 Using trunk and trying to start the {{example-DIH}} version throws the 
 following exception:
 {code}May 22, 2012 8:17:45 PM org.apache.solr.common.SolrException log
 SEVERE: null:org.apache.solr.common.SolrException
   at org.apache.solr.core.SolrCore.init(SolrCore.java:614)
   [...]
 Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: 
 Entity must have a name.
   at org.apache.solr.handler.dataimport.config.Entity.init(Entity.java:54)
   at 
 org.apache.solr.handler.dataimport.config.DIHConfiguration.init(DIHConfiguration.java:61)
   at 
 org.apache.solr.handler.dataimport.DataImporter.readFromXml(DataImporter.java:249)
   at 
 org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:187)
   ... 49 more{code}
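For reference, a minimal DIH data-config entity that satisfies this check might look like the following (the dataSource details and entity name are illustrative):

{code:xml}
<dataConfig>
  <dataSource type="JdbcDataSource" driver="org.hsqldb.jdbcDriver"
              url="jdbc:hsqldb:./example-DIH/hsqldb/ex" user="sa"/>
  <document>
    <!-- the name attribute is what Entity's constructor requires -->
    <entity name="item" query="select * from item"/>
  </document>
</dataConfig>
{code}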

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans

2012-05-23 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281519#comment-13281519
 ] 

Simon Willnauer commented on LUCENE-2878:
-

hey folks,

due to heavy modifications on trunk I had almost no choice but to create a new 
branch and manually move the changes over via selective diffs. The branch is 
now here: https://svn.apache.org/repos/asf/lucene/dev/branches/LUCENE-2878

The current state of the branch is: it compiles :)

There are lots of nocommits / todos, and several tests fail due to 
unimplemented stuff in the new specialized boolean scorers. Happy coding everybody!
 

 Allow Scorer to expose positions and payloads aka. nuke spans 
 --

 Key: LUCENE-2878
 URL: https://issues.apache.org/jira/browse/LUCENE-2878
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/search
Affects Versions: Positions Branch
Reporter: Simon Willnauer
Assignee: Simon Willnauer
  Labels: gsoc2011, gsoc2012, lucene-gsoc-11, lucene-gsoc-12, 
 mentor
 Fix For: Positions Branch

 Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, 
 LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, 
 PosHighlighter.patch


 Currently we have two somewhat separate types of queries: the ones that can 
 make use of positions (mainly spans) and payloads (spans). Yet Span*Query 
 doesn't really do scoring comparable to what other queries do, and at the end 
 of the day they duplicate a lot of code all over Lucene. Span*Queries are 
 also limited to other Span*Query instances, so you cannot use a TermQuery or 
 a BooleanQuery with SpanNear or anything like that. 
 Besides the Span*Query limitation, other queries lack a quite interesting 
 feature: they cannot score based on term proximity, since scorers don't 
 expose any positional information. All those problems bugged me for a while, 
 so I started working on this using the bulkpostings API. I would have done 
 the first cut on trunk, but TermScorer there works on a BlockReader that does 
 not expose positions, while the one in this branch does. I started by adding 
 a new Positions class which users can pull from a scorer; to prevent 
 unnecessary positions enums I added ScorerContext#needsPositions and 
 eventually Scorer#needsPayloads to create the corresponding enum on demand. 
 Yet currently only TermQuery / TermScorer implements this API; others simply 
 return null instead. 
 To show that the API really works, and that our BulkPostings work fine with 
 positions too, I cut TermSpanQuery over to use a TermScorer under the hood 
 and nuked TermSpans entirely. A nice side effect of this was that the 
 Position BulkReading implementation got some exercise, which now :) works 
 with positions, while payloads for bulk reading are kind of experimental in 
 the patch and only work with the Standard codec. 
 So all spans now work on top of TermScorer (I have truly hated spans since 
 today), including the ones that need payloads (StandardCodec ONLY)!! I didn't 
 bother to implement the other codecs yet, since I want to get feedback on the 
 API and on this first cut before I go on with it. I will upload the 
 corresponding patch in a minute. 
 I also had to cut SpanQuery.getSpans(IR) over to 
 SpanQuery.getSpans(AtomicReaderContext), which I should probably have done on 
 trunk first, but after the pain today I need a break first :).
 The patch passes all core tests 
 (org.apache.lucene.search.highlight.HighlighterTest still fails, but I didn't 
 look into the MemoryIndex BulkPostings API yet)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Using term offsets for hit highlighting

2012-05-23 Thread Simon Willnauer
alan,

I merged the branch manually and created a new branch from it. It's
here: https://svn.apache.org/repos/asf/lucene/dev/branches/LUCENE-2878
The branch compiles, but there are lots of nocommits / todos.

If you have questions, please ask; I will help as much as I can.

simon

On Tue, May 22, 2012 at 8:38 PM, Alan Woodward
alan.woodw...@romseysoftware.co.uk wrote:
 Hey, I reckon I can have a decent go at getting the branch updated.  Is it 
 best to work this out as a patch applying to trunk?  Any patch that merges in 
 all the trunk changes to the branch is going to be absolutely massive…

 On 17 May 2012, at 13:15, Simon Willnauer wrote:

 ok man. I will try to merge up the branch. I tell you this is going to
 be messy and it might not compile but I will make it reasonable so you
 can start.

 simon

 On Thu, May 17, 2012 at 8:03 AM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk wrote:
 Sorry for vanishing for so long, life unexpectedly caught up with me...  
 I'm going to have some time to look at this again next week though, if 
 you're interested in picking it up again.

 On 21 Mar 2012, at 09:02, Alan Woodward wrote:

 That would be great, thanks!  I had a go at merging it last night, but 
 there are a *lot* of changes that I haven't got my head round yet, so it 
 was getting pretty messy.

 On 21 Mar 2012, at 08:49, Simon Willnauer wrote:

 Alan, if you want I can just merge the branch up next week and we
 iterate from there?

 simon

 On Tue, Mar 20, 2012 at 12:34 PM, Erick Erickson
 erickerick...@gmail.com wrote:
 Yep, the first challenge is always getting the old patch(es) to 
 apply.

 On Tue, Mar 20, 2012 at 4:09 AM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk wrote:
 Thanks for all the offers of help!  It looks as though most of the hard 
 work has already been done, which is exactly where I like to pick up 
 projects.  :-)

 Maybe the best place to start would be for me to rebase the branch 
 against trunk, and see what still fits?  I think there have been some 
 fairly major changes in the internals since July last year.

 On 19 Mar 2012, at 17:07, Mike Sokolov wrote:

 I posted a patch with a Collector somewhat similar to what you 
 described, Alan - it's attached to one of the sub-issues 
 https://issues.apache.org/jira/browse/LUCENE-3318.   It is in a fairly 
 complete alpha state, but has seen no production use of course, 
 since it relies on the remainder of the unfinished work in that 
 branch.  It works by creating a TokenStream based on match positions 
 returned from the query and passing that to the existing Highlighter.  
 Please feel free to get in touch if you decide to look into that and 
 have questions.


 -Mike

 On 03/19/2012 11:51 AM, Simon Willnauer wrote:
 On Mon, Mar 19, 2012 at 4:50 PM, Uwe Schindleru...@thetaphi.de  
 wrote:

 Have you marked that for GSOC? Would be a good idea!

 yes I did

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de



 -Original Message-
 From: Simon Willnauer [mailto:simon.willna...@googlemail.com]
 Sent: Monday, March 19, 2012 4:43 PM
 To: dev@lucene.apache.org
 Subject: Re: Using term offsets for hit highlighting

 Alan, you made my day!

 The branch is kind of outdated but I looked at it lately and I can 
 certainly help
 to get it up to speed. The feature in that branch is quite a big 
 one and its in a
 very early stage. Still I want to encourage you to take a look and 
 work on it. I
 promise all my help with the issues!

 let me know if you have questions!

 simon

 On Mon, Mar 19, 2012 at 3:52 PM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk  wrote:

 Cool, thanks Robert.  I'll take a look at the JIRA ticket.

 On 19 Mar 2012, at 14:44, Robert Muir wrote:


 On Mon, Mar 19, 2012 at 10:38 AM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk  wrote:

 Hello,

 The project I'm currently working on requires the reporting of 
 exact
 hit positions from some pretty hairy queries, not all of which 
 are
 covered by the existing highlighter modules.  I'm working round 
 this
 by translating everything into SpanQueries, and using the 
 getSpans()
 method to locate hits (I've extended the Spans interface to make
 term offsets available - see
 https://issues.apache.org/jira/browse/LUCENE-3826).  This works 
 for
 our use-case, but isn't terribly efficient, and obviously isn't 
 applicable to

 non-Span queries.

 I've seen a bit of chatter on the list about using term offsets 
 to
 provide accurate highlighting in Lucene.  I'm going to have a 
 couple
 of weeks free in April, and I thought I might have a go at
 implementing this.  Mainly I'm wondering if there's already been
 thoughts about how to do it.  My current thoughts are to somehow
 extend the Weight and Scorer interface to make term offsets
 available; to get highlights for a given set of documents, you'd
 essentially run the query again, with a filter on just the 
 documents
 you want highlighted, and have a 

Re: Using term offsets for hit highlighting

2012-05-23 Thread Alan Woodward
Sweet, thanks Simon.  I'll have a go at getting some failing tests passing to 
begin with.

On 23 May 2012, at 11:59, Simon Willnauer wrote:

 alan,
 
 I merged the branch manually and created a new branch from it. its
 here: https://svn.apache.org/repos/asf/lucene/dev/branches/LUCENE-2878
 the branch compiles but lots of nocommits / todos
 
 if you have questions please ask I will help as much as I can
 
 simon
 
 On Tue, May 22, 2012 at 8:38 PM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk wrote:
 Hey, I reckon I can have a decent go at getting the branch updated.  Is it 
 best to work this out as a patch applying to trunk?  Any patch that merges 
 in all the trunk changes to the branch is going to be absolutely massive…
 
 On 17 May 2012, at 13:15, Simon Willnauer wrote:
 
 ok man. I will try to merge up the branch. I tell you this is going to
 be messy and it might not compile but I will make it reasonable so you
 can start.
 
 simon
 
 On Thu, May 17, 2012 at 8:03 AM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk wrote:
 Sorry for vanishing for so long, life unexpectedly caught up with me...  
 I'm going to have some time to look at this again next week though, if 
 you're interested in picking it up again.
 
 On 21 Mar 2012, at 09:02, Alan Woodward wrote:
 
 That would be great, thanks!  I had a go at merging it last night, but 
 there are a *lot* of changes that I haven't got my head round yet, so it 
 was getting pretty messy.
 
 On 21 Mar 2012, at 08:49, Simon Willnauer wrote:
 
 Alan, if you want I can just merge the branch up next week and we
 iterate from there?
 
 simon
 
 On Tue, Mar 20, 2012 at 12:34 PM, Erick Erickson
 erickerick...@gmail.com wrote:
 Yep, the first challenge is always getting the old patch(es) to 
 apply.
 
 On Tue, Mar 20, 2012 at 4:09 AM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk wrote:
 Thanks for all the offers of help!  It looks as though most of the 
 hard work has already been done, which is exactly where I like to pick 
 up projects.  :-)
 
 Maybe the best place to start would be for me to rebase the branch 
 against trunk, and see what still fits?  I think there have been some 
 fairly major changes in the internals since July last year.
 
 On 19 Mar 2012, at 17:07, Mike Sokolov wrote:
 
 I posted a patch with a Collector somewhat similar to what you 
 described, Alan - it's attached to one of the sub-issues 
 https://issues.apache.org/jira/browse/LUCENE-3318.   It is in a 
 fairly complete alpha state, but has seen no production use of 
 course, since it relies on the remainder of the unfinished work in 
 that branch.  It works by creating a TokenStream based on match 
 positions returned from the query and passing that to the existing 
 Highlighter.  Please feel free to get in touch if you decide to look 
 into that and have questions.
 
 
 -Mike
 
 On 03/19/2012 11:51 AM, Simon Willnauer wrote:
 On Mon, Mar 19, 2012 at 4:50 PM, Uwe Schindleru...@thetaphi.de  
 wrote:
 
 Have you marked that for GSOC? Would be a good idea!
 
 yes I did
 
 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de
 
 
 
 -Original Message-
 From: Simon Willnauer [mailto:simon.willna...@googlemail.com]
 Sent: Monday, March 19, 2012 4:43 PM
 To: dev@lucene.apache.org
 Subject: Re: Using term offsets for hit highlighting
 
 Alan, you made my day!
 
 The branch is kind of outdated but I looked at it lately and I can 
 certainly help
 to get it up to speed. The feature in that branch is quite a big 
 one and its in a
 very early stage. Still I want to encourage you to take a look and 
 work on it. I
 promise all my help with the issues!
 
 let me know if you have questions!
 
 simon
 
 On Mon, Mar 19, 2012 at 3:52 PM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk  wrote:
 
 Cool, thanks Robert.  I'll take a look at the JIRA ticket.
 
 On 19 Mar 2012, at 14:44, Robert Muir wrote:
 
 
 On Mon, Mar 19, 2012 at 10:38 AM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk  wrote:
 
 Hello,
 
 The project I'm currently working on requires the reporting of 
 exact
 hit positions from some pretty hairy queries, not all of which 
 are
 covered by the existing highlighter modules.  I'm working round 
 this
 by translating everything into SpanQueries, and using the 
 getSpans()
 method to locate hits (I've extended the Spans interface to make
 term offsets available - see
 https://issues.apache.org/jira/browse/LUCENE-3826).  This works 
 for
 our use-case, but isn't terribly efficient, and obviously isn't 
 applicable to
 
 non-Span queries.
 
 I've seen a bit of chatter on the list about using term offsets 
 to
 provide accurate highlighting in Lucene.  I'm going to have a 
 couple
 of weeks free in April, and I thought I might have a go at
 implementing this.  Mainly I'm wondering if there's already been
 thoughts about how to do it.  My current thoughts are to somehow
 extend the Weight and Scorer interface to 

Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java7-64 #97

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/97/

--
[...truncated 10994 lines...]
   [junit4]  
   [junit4] Suite: org.apache.solr.internal.csv.CSVPrinterTest
   [junit4] Completed in 0.71s, 6 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestSolrDeletionPolicy2
   [junit4] Completed in 1.03s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.BasicDistributedZkTest
   [junit4] Completed in 61.64s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.BasicZkTest
   [junit4] Completed in 12.07s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.ZkControllerTest
   [junit4] Completed in 22.53s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.TestHashPartitioner
   [junit4] Completed in 8.15s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.request.SimpleFacetsTest
   [junit4] Completed in 8.55s, 29 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.MoreLikeThisHandlerTest
   [junit4] Completed in 1.26s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.ConvertedLegacyTest
   [junit4] Completed in 3.86s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestJmxIntegration
   [junit4] IGNORED 0.00s | TestJmxIntegration.testJmxOnCoreReload
   [junit4] Cause: Annotated @Ignore(timing problem? 
https://issues.apache.org/jira/browse/SOLR-2715)
   [junit4] Completed in 1.99s, 3 tests, 1 skipped
   [junit4]  
   [junit4] Suite: org.apache.solr.servlet.SolrRequestParserTest
   [junit4] Completed in 1.70s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.StandardRequestHandlerTest
   [junit4] Completed in 1.04s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterTest
   [junit4] Completed in 1.51s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.BasicFunctionalityTest
   [junit4] IGNORED 0.00s | BasicFunctionalityTest.testDeepPaging
   [junit4] Cause: Annotated @Ignore(See SOLR-1726)
   [junit4] Completed in 2.95s, 23 tests, 1 skipped
   [junit4]  
   [junit4] Suite: org.apache.solr.update.SolrCmdDistributorTest
   [junit4] Completed in 2.55s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.IndexBasedSpellCheckerTest
   [junit4] Completed in 1.62s, 5 tests
   [junit4]  
   [junit4] Suite: 
org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest
   [junit4] Completed in 1.45s, 6 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestCoreContainer
   [junit4] Completed in 2.67s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.CSVRequestHandlerTest
   [junit4] Completed in 0.88s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.search.function.SortByFunctionTest
   [junit4] Completed in 1.78s, 2 tests
   [junit4]  
   [junit4] Suite: 
org.apache.solr.update.processor.UniqFieldsUpdateProcessorFactoryTest
   [junit4] Completed in 0.74s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.admin.CoreAdminHandlerTest
   [junit4] Completed in 1.71s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.SpellCheckCollatorTest
   [junit4] Completed in 1.37s, 5 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterTSTTest
   [junit4] Completed in 1.03s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.request.TestBinaryResponseWriter
   [junit4] Completed in 1.32s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.servlet.NoCacheHeaderTest
   [junit4] Completed in 0.81s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.servlet.CacheHeaderTest
   [junit4] Completed in 0.82s, 5 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestPropInject
   [junit4] Completed in 1.45s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.CopyFieldTest
   [junit4] Completed in 0.57s, 6 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestSolrDeletionPolicy1
   [junit4] IGNOR/A 0.02s | TestSolrDeletionPolicy1.testCommitAge
   [junit4] Assumption #1: This test is not working on Windows (or maybe 
machines with only 2 CPUs)
   [junit4]   2 780 T3476 oas.SolrTestCaseJ4.setUp ###Starting testCommitAge
   [junit4]   2 ASYNC  NEW_CORE C45 name=collection1 
org.apache.solr.core.SolrCore@3e9f3371
   [junit4]   2 784 T3476 C45 oasu.DirectUpdateHandler2.deleteAll 
[collection1] REMOVING ALL DOCUMENTS FROM INDEX
   [junit4]   2 787 T3476 C45 oasc.SolrDeletionPolicy.onInit 
SolrDeletionPolicy.onInit: commits:num=1
   [junit4]   2
commit{dir=MockDirWrapper(org.apache.lucene.store.RAMDirectory@38f9f930 
lockFactory=org.apache.lucene.store.NativeFSLockFactory@1b67117f),segFN=segments_1,generation=1,filenames=[segments_1]
   [junit4]   2 787 T3476 C45 oasc.SolrDeletionPolicy.updateCommits newest 
commit = 1
   [junit4]   2 787 T3476 C45 UPDATE [collection1] webapp=null path=null 
params={} {deleteByQuery=*:*} 0 3
   [junit4]   2 792 T3476 

[JENKINS] Lucene-Solr-tests-only-trunk - Build # 14283 - Failure

2012-05-23 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/14283/

1 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.handler.TestReplicationHandler

Error Message:
ERROR: SolrIndexSearcher opens=74 closes=73

Stack Trace:
java.lang.AssertionError: ERROR: SolrIndexSearcher opens=74 closes=73
at __randomizedtesting.SeedInfo.seed([2D7514737EA2DD02]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:190)
at org.apache.solr.SolrTestCaseJ4.afterClass(SolrTestCaseJ4.java:82)
at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:752)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at 
org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
at 
org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)




Build Log (for compile errors):
[...truncated 10346 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-3476) Create a Solr Core with a given commit point

2012-05-23 Thread ludovic Boutros (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ludovic Boutros updated SOLR-3476:
--

Issue Type: Improvement  (was: New Feature)

 Create a Solr Core with a given commit point
 

 Key: SOLR-3476
 URL: https://issues.apache.org/jira/browse/SOLR-3476
 Project: Solr
  Issue Type: Improvement
  Components: multicore
Affects Versions: 3.6
Reporter: ludovic Boutros
 Attachments: commitPoint.patch


 In some configurations, we need to open new cores with a given commit point.
 For instance, when the publication of new documents must be controlled (legal 
 obligations) in a master-slave configuration there are two cores on the same 
 instanceDir and dataDir which are using two versions of the index.
 The switch between the two cores is done manually.
 The problem is that when the replication is done one day before the switch, 
 if any problem occurs and we need to restart Tomcat, the new documents are 
 published prematurely.
 With this functionality, we could ensure that the index generation used by 
 the core serving queries is always the right one. 
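For context, Lucene's public API can already open a reader on a specific retained commit, which is roughly the capability this issue asks Solr to expose at core creation time. A hedged sketch against the Lucene 4.x API (the helper and its generation-based lookup are illustrative only; older commits must be kept alive by the configured IndexDeletionPolicy):

```java
import java.io.IOException;
import java.util.List;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.store.Directory;

public class CommitPointDemo {
    // Open a reader on the commit with the given generation instead of
    // the latest one; DirectoryReader.listCommits only returns commits
    // that the deletion policy has retained.
    static DirectoryReader openAtGeneration(Directory dir, long generation)
            throws IOException {
        List<IndexCommit> commits = DirectoryReader.listCommits(dir);
        for (IndexCommit commit : commits) {
            if (commit.getGeneration() == generation) {
                return DirectoryReader.open(commit);
            }
        }
        throw new IllegalArgumentException(
                "no retained commit with generation " + generation);
    }
}
```

A Solr-level option would presumably resolve the requested commit point the same way before building the core's searcher.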

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira






Re: Using term offsets for hit highlighting

2012-05-23 Thread Simon Willnauer
hey alan,

I added position iterator support to ConjunctionTermScorer and
committed it to the branch. All tests that don't rely on payloads are
passing in core. Previously we had to decide if we need positions up
front; the current code can pull them lazily, which causes fewer changes
to the Scorer API. I think we should keep it that way; the only
problem is that we currently have no way to tell the
iterators whether we need payloads or not. The same is true for offsets, since
they are now in the index. I think it would be good if you could
tackle the payloads first and pass some info to the Scorer#positions()
method so we can pull the right thing.

happy coding.

simon

On Wed, May 23, 2012 at 1:23 PM, Alan Woodward
alan.woodw...@romseysoftware.co.uk wrote:
 Sweet, thanks Simon.  I'll have a go at getting some failing tests passing to 
 begin with.

 On 23 May 2012, at 11:59, Simon Willnauer wrote:

 alan,

 I merged the branch manually and created a new branch from it. its
 here: https://svn.apache.org/repos/asf/lucene/dev/branches/LUCENE-2878
 the branch compiles but lots of nocommits / todos

 if you have questions please ask I will help as much as I can

 simon

 On Tue, May 22, 2012 at 8:38 PM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk wrote:
 Hey, I reckon I can have a decent go at getting the branch updated.  Is it 
 best to work this out as a patch applying to trunk?  Any patch that merges 
 in all the trunk changes to the branch is going to be absolutely massive…

 On 17 May 2012, at 13:15, Simon Willnauer wrote:

 ok man. I will try to merge up the branch. I tell you this is going to
 be messy and it might not compile but I will make it reasonable so you
 can start.

 simon

 On Thu, May 17, 2012 at 8:03 AM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk wrote:
 Sorry for vanishing for so long, life unexpectedly caught up with me...  
 I'm going to have some time to look at this again next week though, if 
 you're interested in picking it up again.

 On 21 Mar 2012, at 09:02, Alan Woodward wrote:

 That would be great, thanks!  I had a go at merging it last night, but 
 there are a *lot* of changes that I haven't got my head round yet, so it 
 was getting pretty messy.

 On 21 Mar 2012, at 08:49, Simon Willnauer wrote:

 Alan, if you want I can just merge the branch up next week and we
 iterate from there?

 simon

 On Tue, Mar 20, 2012 at 12:34 PM, Erick Erickson
 erickerick...@gmail.com wrote:
 Yep, the first challenge is always getting the old patch(es) to 
 apply.

 On Tue, Mar 20, 2012 at 4:09 AM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk wrote:
 Thanks for all the offers of help!  It looks as though most of the 
 hard work has already been done, which is exactly where I like to 
 pick up projects.  :-)

 Maybe the best place to start would be for me to rebase the branch 
 against trunk, and see what still fits?  I think there have been some 
 fairly major changes in the internals since July last year.

 On 19 Mar 2012, at 17:07, Mike Sokolov wrote:

 I posted a patch with a Collector somewhat similar to what you 
 described, Alan - it's attached to one of the sub-issues 
 https://issues.apache.org/jira/browse/LUCENE-3318.   It is in a 
 fairly complete alpha state, but has seen no production use of 
 course, since it relies on the remainder of the unfinished work in 
 that branch.  It works by creating a TokenStream based on match 
 positions returned from the query and passing that to the existing 
 Highlighter.  Please feel free to get in touch if you decide to look 
 into that and have questions.


 -Mike

 On 03/19/2012 11:51 AM, Simon Willnauer wrote:
 On Mon, Mar 19, 2012 at 4:50 PM, Uwe Schindleru...@thetaphi.de  
 wrote:

 Have you marked that for GSOC? Would be a good idea!

 yes I did

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de



 -Original Message-
 From: Simon Willnauer [mailto:simon.willna...@googlemail.com]
 Sent: Monday, March 19, 2012 4:43 PM
 To: dev@lucene.apache.org
 Subject: Re: Using term offsets for hit highlighting

 Alan, you made my day!

 The branch is kind of outdated but I looked at it lately and I 
 can certainly help
 to get it up to speed. The feature in that branch is quite a big 
 one and its in a
 very early stage. Still I want to encourage you to take a look 
 and work on it. I
 promise all my help with the issues!

 let me know if you have questions!

 simon

 On Mon, Mar 19, 2012 at 3:52 PM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk  wrote:

 Cool, thanks Robert.  I'll take a look at the JIRA ticket.

 On 19 Mar 2012, at 14:44, Robert Muir wrote:


 On Mon, Mar 19, 2012 at 10:38 AM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk  wrote:

 Hello,

 The project I'm currently working on requires the reporting of 
 exact
 hit positions from some pretty hairy queries, not all of which 
 are
 covered by the existing highlighter modules.  I'm 

[jira] [Created] (SOLR-3481) Date field value differs between two installations

2012-05-23 Thread David Rekowski (JIRA)
David Rekowski created SOLR-3481:


 Summary: Date field value differs between two installations
 Key: SOLR-3481
 URL: https://issues.apache.org/jira/browse/SOLR-3481
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other
Affects Versions: 3.6
 Environment: A. Mac 10.7.4 with integrated Jetty
B. Ubuntu 12.04 with Tomcat
Reporter: David Rekowski


When I query the Solr server, I get a formatted timestamp in environment A 
(2012-05-11T12:59:01.691Z), whereas in environment B I get a Unix-timestamp-like 
number (1336728376797), which looks like the date in epoch milliseconds.

The corresponding schema definition:
   field name=index_time_s type=date indexed=true stored=true 
default=NOW multiValued=false/


Background: We migrated an index generated on a mac/jetty to a linux/tomcat 
installation of Solr. Regardless of that, this happens with newly indexed 
documents as well.
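The number reported from environment B is consistent with epoch milliseconds. A minimal stand-alone sketch (class and method names are hypothetical, not Solr API) showing how such a value maps onto Solr's canonical UTC date format:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class EpochMillisDemo {
    // Format an epoch-milliseconds value the way Solr renders date fields:
    // ISO-8601 with millisecond precision, always in UTC.
    static String toSolrDate(long epochMillis) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        // The raw value from environment B, rendered as environment A would
        System.out.println(toSolrDate(1336728376797L)); // 2012-05-11T09:26:16.797Z
    }
}
```

If both environments store the same instant, a difference like this points at response writing or configuration rather than at the indexed value itself.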





[jira] [Commented] (SOLR-3238) Sequel of Admin UI

2012-05-23 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281607#comment-13281607
 ] 

Markus Jelsma commented on SOLR-3238:
-

I think it would also be useful to display the shard information in the core 
overview page such as its ID and whether it is a leader.

 Sequel of Admin UI
 --

 Key: SOLR-3238
 URL: https://issues.apache.org/jira/browse/SOLR-3238
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Affects Versions: 4.0
Reporter: Stefan Matheis (steffkes)
Assignee: Stefan Matheis (steffkes)
 Fix For: 4.0

 Attachments: SOLR-3238.patch, SOLR-3238.patch, SOLR-3238.patch, 
 solradminbug.png


 Catch-All Issue for all upcoming Bugs/Reports/Suggestions on the Admin UI




Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java7-64 #98

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/98/





Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #168

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/168/

--
[...truncated 16649 lines...]
   [junit4]   2 62730 T2861 oashl.XMLLoader.init xsltCacheLifetimeSeconds=60
   [junit4]   2 62732 T2861 oasc.SolrCore.initDeprecatedSupport WARNING 
solrconfig.xml uses deprecated admin/gettableFiles, Please update your config 
to use the ShowFileRequestHandler.
   [junit4]   2 62733 T2861 oasc.SolrCore.initDeprecatedSupport WARNING adding 
ShowFileRequestHandler with hidden files: [SOLRCONFIG-HIGHLIGHT.XML, 
SCHEMA-REQUIRED-FIELDS.XML, SCHEMA-REPLICATION2.XML, SCHEMA-MINIMAL.XML, 
BAD-SCHEMA-DUP-DYNAMICFIELD.XML, SOLRCONFIG-CACHING.XML, 
SOLRCONFIG-REPEATER.XML, CURRENCY.XML, BAD-SCHEMA-NONTEXT-ANALYZER.XML, 
SOLRCONFIG-MERGEPOLICY.XML, SOLRCONFIG-TLOG.XML, SOLRCONFIG-MASTER.XML, 
SCHEMA11.XML, SOLRCONFIG-BASIC.XML, DA_COMPOUNDDICTIONARY.TXT, 
SCHEMA-COPYFIELD-TEST.XML, SOLRCONFIG-SLAVE.XML, ELEVATE.XML, 
SOLRCONFIG-PROPINJECT-INDEXDEFAULT.XML, SCHEMA-IB.XML, 
SOLRCONFIG-QUERYSENDER.XML, SCHEMA-REPLICATION1.XML, DA_UTF8.XML, 
HYPHENATION.DTD, SOLRCONFIG-ENABLEPLUGIN.XML, STEMDICT.TXT, 
SCHEMA-PHRASESUGGEST.XML, HUNSPELL-TEST.AFF, STOPTYPES-1.TXT, 
STOPWORDSWRONGENCODING.TXT, SCHEMA-NUMERIC.XML, SOLRCONFIG-TRANSFORMERS.XML, 
SOLRCONFIG-PROPINJECT.XML, BAD-SCHEMA-NOT-INDEXED-BUT-TF.XML, 
SOLRCONFIG-SIMPLELOCK.XML, WDFTYPES.TXT, STOPTYPES-2.TXT, SCHEMA-REVERSED.XML, 
SOLRCONFIG-SPELLCHECKCOMPONENT.XML, SCHEMA-DFR.XML, 
SOLRCONFIG-PHRASESUGGEST.XML, BAD-SCHEMA-NOT-INDEXED-BUT-POS.XML, KEEP-1.TXT, 
OPEN-EXCHANGE-RATES.JSON, STOPWITHBOM.TXT, SCHEMA-BINARYFIELD.XML, 
SOLRCONFIG-SPELLCHECKER.XML, SOLRCONFIG-UPDATE-PROCESSOR-CHAINS.XML, 
BAD-SCHEMA-OMIT-TF-BUT-NOT-POS.XML, BAD-SCHEMA-DUP-FIELDTYPE.XML, 
SOLRCONFIG-MASTER1.XML, SYNONYMS.TXT, SCHEMA.XML, SCHEMA_CODEC.XML, 
SOLRCONFIG-SOLR-749.XML, SOLRCONFIG-MASTER1-KEEPONEBACKUP.XML, STOP-2.TXT, 
SOLRCONFIG-FUNCTIONQUERY.XML, SCHEMA-LMDIRICHLET.XML, SOLRCONFIG-TERMINDEX.XML, 
SOLRCONFIG-ELEVATE.XML, STOPWORDS.TXT, SCHEMA-FOLDING.XML, 
SCHEMA-STOP-KEEP.XML, BAD-SCHEMA-NOT-INDEXED-BUT-NORMS.XML, 
SOLRCONFIG-SOLCOREPROPERTIES.XML, STOP-1.TXT, SOLRCONFIG-MASTER2.XML, 
SCHEMA-SPELLCHECKER.XML, SOLRCONFIG-LAZYWRITER.XML, 
SCHEMA-LUCENEMATCHVERSION.XML, BAD-MP-SOLRCONFIG.XML, FRENCHARTICLES.TXT, 
SCHEMA15.XML, SOLRCONFIG-REQHANDLER.INCL, SCHEMASURROUND.XML, 
SCHEMA-COLLATEFILTER.XML, SOLRCONFIG-MASTER3.XML, HUNSPELL-TEST.DIC, 
SOLRCONFIG-XINCLUDE.XML, SOLRCONFIG-DELPOLICY1.XML, SOLRCONFIG-SLAVE1.XML, 
SCHEMA-SIM.XML, SCHEMA-COLLATE.XML, STOP-SNOWBALL.TXT, PROTWORDS.TXT, 
SCHEMA-TRIE.XML, SOLRCONFIG_CODEC.XML, SCHEMA-TFIDF.XML, 
SCHEMA-LMJELINEKMERCER.XML, PHRASESUGGEST.TXT, 
SOLRCONFIG-BASIC-LUCENEVERSION31.XML, OLD_SYNONYMS.TXT, 
SOLRCONFIG-DELPOLICY2.XML, XSLT, SOLRCONFIG-NATIVELOCK.XML, 
BAD-SCHEMA-DUP-FIELD.XML, SOLRCONFIG-NOCACHE.XML, SCHEMA-BM25.XML, 
SOLRCONFIG-ALTDIRECTORY.XML, SOLRCONFIG-QUERYSENDER-NOQUERY.XML, 
COMPOUNDDICTIONARY.TXT, SOLRCONFIG_PERF.XML, 
SCHEMA-NOT-REQUIRED-UNIQUE-KEY.XML, KEEP-2.TXT, SCHEMA12.XML, 
MAPPING-ISOLATIN1ACCENT.TXT, BAD_SOLRCONFIG.XML, 
BAD-SCHEMA-EXTERNAL-FILEFIELD.XML]
   [junit4]   2 62737 T2861 oass.SolrIndexSearcher.init Opening 
Searcher@73f3e55 main
   [junit4]   2 62737 T2861 oass.SolrIndexSearcher.init WARNING WARNING: 
Directory impl does not support setting indexDir: 
org.apache.lucene.store.MockDirectoryWrapper
   [junit4]   2 62739 T2861 oasu.CommitTracker.init Hard AutoCommit: disabled
   [junit4]   2 62739 T2861 oasu.CommitTracker.init Soft AutoCommit: disabled
   [junit4]   2 62739 T2861 oashc.SpellCheckComponent.inform Initializing 
spell checkers
   [junit4]   2 62752 T2861 oass.DirectSolrSpellChecker.init init: 
{name=direct,classname=DirectSolrSpellChecker,field=lowerfilt,minQueryLength=3}
   [junit4]   2 62825 T2861 oashc.HttpShardHandlerFactory.getParameter Setting 
socketTimeout to: 0
   [junit4]   2 62825 T2861 oashc.HttpShardHandlerFactory.getParameter Setting 
urlScheme to: http://
   [junit4]   2 62825 T2861 oashc.HttpShardHandlerFactory.getParameter Setting 
connTimeout to: 0
   [junit4]   2 62825 T2861 oashc.HttpShardHandlerFactory.getParameter Setting 
maxConnectionsPerHost to: 20
   [junit4]   2 62825 T2861 oashc.HttpShardHandlerFactory.getParameter Setting 
corePoolSize to: 0
   [junit4]   2 62825 T2861 oashc.HttpShardHandlerFactory.getParameter Setting 
maximumPoolSize to: 2147483647
   [junit4]   2 62825 T2861 oashc.HttpShardHandlerFactory.getParameter Setting 
maxThreadIdleTime to: 5
   [junit4]   2 62825 T2861 oashc.HttpShardHandlerFactory.getParameter Setting 
sizeOfQueue to: -1
   [junit4]   2 62825 T2861 oashc.HttpShardHandlerFactory.getParameter Setting 
fairnessPolicy to: false
   [junit4]   2 62841 T2861 oasc.CoreContainer.register registering core: 
collection1
   [junit4]   2 62842 T2861 oasu.AbstractSolrTestCase.setUp SETUP_END 
testSoftAndHardCommitMaxTimeMixedAdds
   [junit4]   2 62842 T2861 

[jira] [Comment Edited] (SOLR-3478) DataImportHandler's Entity must have a name

2012-05-23 Thread James Dyer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281135#comment-13281135
 ] 

James Dyer edited comment on SOLR-3478 at 5/23/12 2:45 PM:
---

Thanks for finding this one.  Looking at this issue, I'm pretty sure I 
introduced this bug with SOLR-3422.

  was (Author: jdyer):
Thanks for finding this one.  Looking at this issue, I'm pretty sure I 
introduced this bug with SOLR-3430.
  
 DataImportHandler's Entity must have a name
 ---

 Key: SOLR-3478
 URL: https://issues.apache.org/jira/browse/SOLR-3478
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
 Environment: r1341454, {code}java 
 -Dsolr.solr.home=./example-DIH/solr/ -jar start.jar{code}
Reporter: Stefan Matheis (steffkes)
 Fix For: 4.0

 Attachments: SOLR-3478.patch


 Using trunk and trying to start the {{example-DIH}} version, throws the 
 following Exception:
 {code}May 22, 2012 8:17:45 PM org.apache.solr.common.SolrException log
 SEVERE: null:org.apache.solr.common.SolrException
   at org.apache.solr.core.SolrCore.init(SolrCore.java:614)
   [...]
 Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: 
 Entity must have a name.
   at org.apache.solr.handler.dataimport.config.Entity.init(Entity.java:54)
   at 
 org.apache.solr.handler.dataimport.config.DIHConfiguration.init(DIHConfiguration.java:61)
   at 
 org.apache.solr.handler.dataimport.DataImporter.readFromXml(DataImporter.java:249)
   at 
 org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:187)
   ... 49 more{code}




[jira] [Commented] (SOLR-3478) DataImportHandler's Entity must have a name

2012-05-23 Thread James Dyer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281642#comment-13281642
 ] 

James Dyer commented on SOLR-3478:
--

Actually I think for 4.0 we should break backwards-compatibility with this one 
and require all DIH entities to have a name.  (In 3.6 and prior, it logs a 
warning and assigns a name based on the system clock.)  In SOLR-3422 I fixed 
any unit tests that didn't use name but missed the examples.

 DataImportHandler's Entity must have a name
 ---

 Key: SOLR-3478
 URL: https://issues.apache.org/jira/browse/SOLR-3478
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
 Environment: r1341454, {code}java 
 -Dsolr.solr.home=./example-DIH/solr/ -jar start.jar{code}
Reporter: Stefan Matheis (steffkes)
 Fix For: 4.0

 Attachments: SOLR-3478.patch


 Using trunk and trying to start the {{example-DIH}} version, throws the 
 following Exception:
 {code}May 22, 2012 8:17:45 PM org.apache.solr.common.SolrException log
 SEVERE: null:org.apache.solr.common.SolrException
   at org.apache.solr.core.SolrCore.init(SolrCore.java:614)
   [...]
 Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: 
 Entity must have a name.
   at org.apache.solr.handler.dataimport.config.Entity.init(Entity.java:54)
   at 
 org.apache.solr.handler.dataimport.config.DIHConfiguration.init(DIHConfiguration.java:61)
   at 
 org.apache.solr.handler.dataimport.DataImporter.readFromXml(DataImporter.java:249)
   at 
 org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:187)
   ... 49 more{code}




[jira] [Resolved] (SOLR-2585) Context-Sensitive Spelling Suggestions & Collations

2012-05-23 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer resolved SOLR-2585.
--

   Resolution: Fixed
Fix Version/s: 4.0
 Assignee: James Dyer

Committed to Trunk r1341894.  I will also add this to the wiki.

 Context-Sensitive Spelling Suggestions & Collations
 ---

 Key: SOLR-2585
 URL: https://issues.apache.org/jira/browse/SOLR-2585
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Affects Versions: 4.0
Reporter: James Dyer
Assignee: James Dyer
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2585.patch, SOLR-2585.patch, SOLR-2585.patch, 
 SOLR-2585.patch, SOLR-2585.patch, SOLR-2585.patch, SOLR-2585.patch, 
 SOLR-2585.patch, SOLR-2585.patch, SOLR-2585.patch


 Solr currently cannot offer what I'm calling here a context-sensitive 
 spelling suggestion.  That is, if a user enters one or more words that have 
 docFrequency > 0, but nevertheless are misspelled, then no suggestions are 
 offered.  Currently, Solr will always consider a word correctly spelled if 
 it is in the index and/or dictionary, regardless of context.  This issue & 
 patch add support for context-sensitive spelling suggestions. 
 See SpellCheckCollatorTest.testContextSensitiveCollate() for the typical 
 use case for this functionality.  This tests both using 
 IndexBasedSpellChecker and DirectSolrSpellChecker. 
 Two new Spelling Parameters were added:
   - spellcheck.alternativeTermCount - The count of suggestions to return for 
 each query term existing in the index and/or dictionary.  Presumably, users 
 will want fewer suggestions for words with docFrequency > 0.  Also, setting this 
 value turns on context-sensitive spell suggestions. 
   - spellcheck.maxResultsForSuggest - The maximum number of hits the request 
 can return in order to both generate spelling suggestions and set the 
 correctlySpelled element to false.  For example, if this is set to 5 and 
 the user's query returns 5 or fewer results, the spellchecker will report 
 correctlySpelled=false and also offer suggestions (and collations if 
 requested).  Setting this greater than zero is useful for creating 
 did-you-mean suggestions for queries that return a low number of hits.
 I have also included a test using shards.  See additions to 
 DistributedSpellCheckComponentTest. 
 In Lucene, SpellChecker.java can already support this functionality (by 
 passing a null IndexReader and field-name).  The DirectSpellChecker, however, 
 needs a minor enhancement.  This gives the option to allow DirectSpellChecker 
 to return suggestions for all query terms regardless of frequency.




[jira] [Resolved] (SOLR-3309) Slow WAR startups due to annotation scanning (affects Jetty 8)

2012-05-23 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer resolved SOLR-3309.
--

Resolution: Fixed

Committed the javaee change, Trunk r1341897.

 Slow WAR startups due to annotation scanning (affects Jetty 8)
 -

 Key: SOLR-3309
 URL: https://issues.apache.org/jira/browse/SOLR-3309
 Project: Solr
  Issue Type: Bug
Reporter: Bill Bell
Assignee: James Dyer
 Fix For: 4.0

 Attachments: SOLR-3309.patch, SOLR-3309.patch


 Need to modify web.xml to increase the speed of container startup time. The 
 header also appears to need to be modified...
 http://mostlywheat.wordpress.com/2012/03/10/speeding-up-slow-jetty-8-startups/
 http://www.javabeat.net/articles/print.php?article_id=100
 Adding metadata-complete="true" to the web-app element in our web.xml 
 restored our startup time to 8 seconds.
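For reference, the change discussed is a single attribute on the root element; a sketch of what the top of such a web.xml might look like (Servlet 3.0 schema assumed; this is illustrative, not Solr's actual file):

```xml
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
                             http://java.sun.com/xml/ns/javaee/web-app_3_0.xsd"
         version="3.0"
         metadata-complete="true">
  <!-- metadata-complete="true" declares the deployment descriptor
       complete, so the container skips annotation scanning -->
</web-app>
```

With annotation scanning skipped, Jetty 8 no longer has to walk every class in the WAR at startup.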




[jira] [Resolved] (SOLR-3457) Spellchecker always incorrectly spelled

2012-05-23 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer resolved SOLR-3457.
--

   Resolution: Fixed
Fix Version/s: 4.0
 Assignee: James Dyer

Fixed with SOLR-2585 commit (Trunk r1341894).

 Spellchecker always incorrectly spelled
 ---

 Key: SOLR-3457
 URL: https://issues.apache.org/jira/browse/SOLR-3457
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 4.0
 Environment: solr-spec 4.0.0.2012.05.15.11.42.06
 solr-impl 4.0-SNAPSHOT 1338601 - markus - 2012-05-15 11:42:06
 lucene-spec 4.0-SNAPSHOT
 lucene-impl 4.0-SNAPSHOT 1338601 - markus - 2012-05-15 10:51:02
Reporter: Markus Jelsma
Assignee: James Dyer
 Fix For: 4.0

 Attachments: SOLR-3457-4.0-1.patch


 correctlySpelled is always false with default configuration, example config 
 and example documents:
 http://localhost:8983/solr/collection1/browse?wt=xmlspellcheck.extendedResults=trueq=samsung
 {code}
 lst name=spellcheck
   lst name=suggestions
bool name=correctlySpelledfalse/bool
   /lst
 /lst
 {code}




[jira] [Created] (LUCENE-4074) FST Sorter BufferSize causes int overflow if BufferSize > 2048MB

2012-05-23 Thread Simon Willnauer (JIRA)
Simon Willnauer created LUCENE-4074:
---

 Summary: FST Sorter BufferSize causes int overflow if BufferSize > 
2048MB
 Key: LUCENE-4074
 URL: https://issues.apache.org/jira/browse/LUCENE-4074
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/spellchecker
Affects Versions: 3.6, 4.0
Reporter: Simon Willnauer
 Fix For: 3.6.1, 4.1


the BufferSize constructor accepts the size in MB as an integer and uses 
multiplication to convert to bytes. While it checks that the size in bytes is 
less than 2048 MB, it does so after the byte conversion. If you pass a value > 
2047 to the ctor, the value overflows since all constants and methods based on 
MB expect 32-bit signed ints. This does not even result in an exception until 
the BufferSize is actually passed to the sorter.
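The described overflow is plain 32-bit arithmetic; a minimal stand-alone sketch (class and method names are hypothetical stand-ins for the conversion inside BufferSize, not Lucene code):

```java
public class MbOverflowDemo {
    // Converting MB to bytes with int arithmetic: for mb > 2047 the
    // product exceeds Integer.MAX_VALUE and silently wraps around.
    static int bytesInt(int mb) {
        return mb * 1024 * 1024;
    }

    // Roughly the fix the patch describes: widen to a 64-bit long
    // before multiplying, so large MB values convert correctly.
    static long bytesLong(int mb) {
        return (long) mb * 1024 * 1024;
    }

    public static void main(String[] args) {
        System.out.println(bytesInt(2048));  // -2147483648 (wrapped)
        System.out.println(bytesLong(2048)); // 2147483648
    }
}
```

This also shows why no exception is thrown at construction time: the wrapped negative value is a perfectly legal int until something downstream rejects it.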




[jira] [Updated] (SOLR-3478) DataImportHandler's Entity must have a name

2012-05-23 Thread Stefan Matheis (steffkes) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) updated SOLR-3478:


Assignee: Stefan Matheis (steffkes)

okay, now it's clear for me. will commit the changed example soon

 DataImportHandler's Entity must have a name
 ---

 Key: SOLR-3478
 URL: https://issues.apache.org/jira/browse/SOLR-3478
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
 Environment: r1341454, {code}java 
 -Dsolr.solr.home=./example-DIH/solr/ -jar start.jar{code}
Reporter: Stefan Matheis (steffkes)
Assignee: Stefan Matheis (steffkes)
 Fix For: 4.0

 Attachments: SOLR-3478.patch


 Using trunk and trying to start the {{example-DIH}} version, throws the 
 following Exception:
 {code}May 22, 2012 8:17:45 PM org.apache.solr.common.SolrException log
 SEVERE: null:org.apache.solr.common.SolrException
   at org.apache.solr.core.SolrCore.init(SolrCore.java:614)
   [...]
 Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: 
 Entity must have a name.
   at org.apache.solr.handler.dataimport.config.Entity.init(Entity.java:54)
   at 
 org.apache.solr.handler.dataimport.config.DIHConfiguration.init(DIHConfiguration.java:61)
   at 
 org.apache.solr.handler.dataimport.DataImporter.readFromXml(DataImporter.java:249)
   at 
 org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:187)
   ... 49 more{code}




[jira] [Updated] (LUCENE-4074) FST Sorter BufferSize causes int overflow if BufferSize > 2048MB

2012-05-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-4074:


Attachment: LUCENE-4074.patch

Here is a patch that adds a test case, changes all arguments and constants to 
64-bit signed ints, and checks for negative values in the BufferSize ctor for 
immediate feedback.

 FST Sorter BufferSize causes int overflow if BufferSize > 2048MB
 

 Key: LUCENE-4074
 URL: https://issues.apache.org/jira/browse/LUCENE-4074
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/spellchecker
Affects Versions: 3.6, 4.0
Reporter: Simon Willnauer
 Fix For: 3.6.1, 4.1

 Attachments: LUCENE-4074.patch


 the BufferSize constructor accepts the size in MB as an integer and uses 
 multiplication to convert it to bytes. While it checks that the size in 
 bytes is less than 2048 MB, it does so after the byte conversion. If you 
 pass a value > 2047 to the ctor, the value overflows, since all constants 
 and methods based on MB expect 32-bit signed ints. This does not even result 
 in an exception until the BufferSize is actually passed to the sorter.
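The wrap-around described above is easy to reproduce in plain Java: multiplying an int MB count by 1024*1024 happens entirely in 32-bit arithmetic before any widening. A minimal sketch (not the actual Lucene sorter code; bytesBuggy/bytesFixed are hypothetical helpers illustrating the bug and the 64-bit fix the patch describes):

```java
public class MbOverflowDemo {
    // Mirrors the bug: both operands are ints, so the multiply wraps around
    // before the result is ever widened to long.
    public static long bytesBuggy(int mb) {
        return mb * 1024 * 1024; // overflows for mb >= 2048
    }

    // A fix along the lines of the patch: 64-bit signed arithmetic plus an
    // immediate check for negative input, giving feedback at construction time.
    public static long bytesFixed(long mb) {
        if (mb < 0) {
            throw new IllegalArgumentException("mb must be >= 0: " + mb);
        }
        return mb * 1024 * 1024; // long arithmetic, no wrap-around
    }

    public static void main(String[] args) {
        System.out.println(bytesBuggy(2048)); // -2147483648 (wrapped)
        System.out.println(bytesFixed(2048)); // 2147483648 (correct)
    }
}
```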




[jira] [Assigned] (LUCENE-4074) FST Sorter BufferSize causes int overflow if BufferSize > 2048MB

2012-05-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer reassigned LUCENE-4074:
---

Assignee: Simon Willnauer

 FST Sorter BufferSize causes int overflow if BufferSize > 2048MB
 

 Key: LUCENE-4074
 URL: https://issues.apache.org/jira/browse/LUCENE-4074
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/spellchecker
Affects Versions: 3.6, 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 3.6.1, 4.1

 Attachments: LUCENE-4074.patch


 the BufferSize constructor accepts the size in MB as an integer and uses 
 multiplication to convert it to bytes. While it checks that the size in 
 bytes is less than 2048 MB, it does so after the byte conversion. If you 
 pass a value > 2047 to the ctor, the value overflows, since all constants 
 and methods based on MB expect 32-bit signed ints. This does not even result 
 in an exception until the BufferSize is actually passed to the sorter.




[jira] [Resolved] (SOLR-3478) DataImportHandler's Entity must have a name

2012-05-23 Thread Stefan Matheis (steffkes) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) resolved SOLR-3478.
-

Resolution: Fixed

Committed in r1341920

 DataImportHandler's Entity must have a name
 ---

 Key: SOLR-3478
 URL: https://issues.apache.org/jira/browse/SOLR-3478
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
 Environment: r1341454, {code}java 
 -Dsolr.solr.home=./example-DIH/solr/ -jar start.jar{code}
Reporter: Stefan Matheis (steffkes)
Assignee: Stefan Matheis (steffkes)
 Fix For: 4.0

 Attachments: SOLR-3478.patch


 Using trunk and trying to start the {{example-DIH}} version, throws the 
 following Exception:
 {code}May 22, 2012 8:17:45 PM org.apache.solr.common.SolrException log
 SEVERE: null:org.apache.solr.common.SolrException
   at org.apache.solr.core.SolrCore.&lt;init&gt;(SolrCore.java:614)
   [...]
 Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: 
 Entity must have a name.
   at org.apache.solr.handler.dataimport.config.Entity.&lt;init&gt;(Entity.java:54)
   at 
  org.apache.solr.handler.dataimport.config.DIHConfiguration.&lt;init&gt;(DIHConfiguration.java:61)
   at 
 org.apache.solr.handler.dataimport.DataImporter.readFromXml(DataImporter.java:249)
   at 
 org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:187)
   ... 49 more{code}




Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java6-64 #169

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/169/





Re: [jira] [Resolved] (LUCENE-4051) Fix File Headers for Lucene40 StoredFields & TermVectors

2012-05-23 Thread Ryan McKinley
For 4.0-alpha, are there other known file format changes in the works?



 committed to trunk in rev. 1341768. I sent out a heads-up mail to the dev list 
 since this breaks the index file format.

 thanks for reviewing lets get 4.0-alpha out!





Re: [jira] [Resolved] (LUCENE-4051) Fix File Headers for Lucene40 StoredFields & TermVectors

2012-05-23 Thread Robert Muir
yes

On Wed, May 23, 2012 at 12:14 PM, Ryan McKinley ryan...@gmail.com wrote:
 For 4.0-alpha, are there other known file format changes in the works?



  committed to trunk in rev. 1341768. I sent out a heads-up mail to the dev 
  list since this breaks the index file format.

 thanks for reviewing lets get 4.0-alpha out!






-- 
lucidimagination.com




[jira] [Commented] (LUCENE-4051) Fix File Headers for Lucene40 StoredFields & TermVectors

2012-05-23 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281700#comment-13281700
 ] 

Uwe Schindler commented on LUCENE-4051:
---

Thank you very much! Now the file formats are finally consistent. Maybe our index 
files' consistent magic numbers will now also get added to the unix file(1) 
command :-)

 Fix File Headers for Lucene40 StoredFields & TermVectors
 

 Key: LUCENE-4051
 URL: https://issues.apache.org/jira/browse/LUCENE-4051
 Project: Lucene - Java
  Issue Type: Task
  Components: core/codecs
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.0

 Attachments: LUCENE-4051.patch, LUCENE-4051.patch, LUCENE-4051.patch, 
 LUCENE-4051.patch, LUCENE-4051.patch


 Currently we still write the old file header format in 
 Lucene40StoredFieldFormat & Lucene40TermVectorsFormat. We should cut over to 
 use CodecUtil and reset the versioning before we release Lucene 4.0
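For context, a CodecUtil-style header is just a fixed magic int, the codec name, and a version. The sketch below reproduces that layout with plain java.io; it is an illustration only: the magic value 0x3fd76c17 matches Lucene's CodecUtil.CODEC_MAGIC, but the single-byte length prefix here is a simplification, not Lucene's actual writeString encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class CodecHeaderDemo {
    // Same value as Lucene's CodecUtil.CODEC_MAGIC; it lets tools recognize
    // Lucene index files from their first four bytes.
    public static final int CODEC_MAGIC = 0x3fd76c17;

    // Write magic + codec name + version, mimicking the header layout.
    public static byte[] header(String codec, int version) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(CODEC_MAGIC);     // fixed magic number, identical in every file
            out.writeByte(codec.length()); // simplified length prefix (ASCII names only)
            out.writeBytes(codec);         // codec name identifies the writing format
            out.writeInt(version);         // per-format version for back-compat checks
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e); // cannot happen for an in-memory stream
        }
    }

    public static void main(String[] args) {
        byte[] h = header("Lucene40StoredFields", 0);
        System.out.println(h.length); // 4 (magic) + 1 (len) + 20 (name) + 4 (version) = 29
    }
}
```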




[jira] [Closed] (SOLR-2161) BasicDistributedZkTest.testDistribSearch test failure

2012-05-23 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley closed SOLR-2161.
--

   Resolution: Fixed
Fix Version/s: (was: 4.1)
   4.0

I checked in some changes that should fix the most common failures we were 
seeing due to the recovery threads not being stopped.

 BasicDistributedZkTest.testDistribSearch test failure
 -

 Key: SOLR-2161
 URL: https://issues.apache.org/jira/browse/SOLR-2161
 Project: Solr
  Issue Type: Bug
  Components: Build
Affects Versions: 4.0
 Environment: Hudson
Reporter: Robert Muir
 Fix For: 4.0


 BasicDistributedZkTest.testDistribSearch failed in Hudson.
 Here is the stacktrace:
 {noformat}
 [junit] Testsuite: org.apache.solr.cloud.BasicDistributedZkTest
 [junit] Testcase: 
 testDistribSearch(org.apache.solr.cloud.BasicDistributedZkTest):
 Caused an ERROR
 [junit] Error executing query
 [junit] org.apache.solr.client.solrj.SolrServerException: Error executing 
 query
 [junit]   at 
 org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
 [junit]   at 
 org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:119)
 [junit]   at 
 org.apache.solr.BaseDistributedSearchTestCase.queryServer(BaseDistributedSearchTestCase.java:290)
 [junit]   at 
 org.apache.solr.cloud.BasicDistributedZkTest.queryServer(BasicDistributedZkTest.java:256)
 [junit]   at 
 org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:305)
 [junit]   at 
 org.apache.solr.cloud.BasicDistributedZkTest.doTest(BasicDistributedZkTest.java:227)
 [junit]   at 
 org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:562)
 [junit]   at 
 org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:795)
 [junit]   at 
 org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:768)
 [junit] Caused by: org.apache.solr.common.SolrException: 
 org.apache.solr.client.solrj.SolrServerException: 
 org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 
 failed to respond  org.apache.solr.common.SolrException: 
 org.apache.solr.client.solrj.SolrServerException: 
 org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 
 failed to respond   at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:318)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1325)at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
  at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
   at 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388) 
 at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)   
   at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765) 
 at 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) 
 at org.mortbay.jetty.Server.handle(Server.java:326) at 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)  
 at 
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923)
   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547)  at 
 org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at 
 org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at 
 org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)  
at 
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) 
 Caused by: org.apache.solr.client.solrj.SolrServerException: 
 org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 
 failed to respond at 
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:483)
   at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.reque
 [junit] 
 [junit] org.apache.solr.client.solrj.SolrServerException: 
 org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 
 failed to respond  org.apache.solr.common.SolrException: 
 org.apache.solr.client.solrj.SolrServerException: 
 org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 
 failed to respondat 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:318)
 at 
 

[jira] [Created] (SOLR-3482) Cannot index emails, mistakes of configuration file data-config.xml & solrconfig.xml

2012-05-23 Thread Emma Bo Liu (JIRA)
Emma Bo Liu created SOLR-3482:
-

 Summary: Cannot index emails, mistakes of configuration file 
data-config.xml & solrconfig.xml
 Key: SOLR-3482
 URL: https://issues.apache.org/jira/browse/SOLR-3482
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.0
 Environment: windows
Reporter: Emma Bo Liu


The mail core cannot be brought up. There are mistakes in data-config.xml & 
solrconfig.xml. It cannot find Tika. 




[jira] [Updated] (SOLR-3482) Cannot index emails, mistakes of configuration file data-config.xml & solrconfig.xml

2012-05-23 Thread Emma Bo Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emma Bo Liu updated SOLR-3482:
--

Priority: Minor  (was: Major)

 Cannot index emails, mistakes of configuration file data-config.xml & 
 solrconfig.xml
 --

 Key: SOLR-3482
 URL: https://issues.apache.org/jira/browse/SOLR-3482
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.0
 Environment: windows
Reporter: Emma Bo Liu
Priority: Minor
  Labels: core, email, index, solr, tika

 The mail core cannot be brought up. There are mistakes in data-config.xml & 
 solrconfig.xml. It cannot find Tika. 




[jira] [Updated] (SOLR-3482) Cannot index emails, mistakes of configuration file data-config.xml & solrconfig.xml

2012-05-23 Thread Emma Bo Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emma Bo Liu updated SOLR-3482:
--

Description: The mail core cannot be brought up. There are mistakes in 
data-config.xml & solrconfig.xml. It cannot find Tika. The example mail core 
is not complete; files are missing.  (was: The mail core cannot be brought up. 
There are mistakes in data-config.xml & solrconfig.xml. It cannot find Tika.)

 Cannot index emails, mistakes of configuration file data-config.xml & 
 solrconfig.xml
 --

 Key: SOLR-3482
 URL: https://issues.apache.org/jira/browse/SOLR-3482
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.0
 Environment: windows
Reporter: Emma Bo Liu
Priority: Minor
  Labels: core, email, index, solr, tika

 The mail core cannot be brought up. There are mistakes in data-config.xml & 
 solrconfig.xml. It cannot find Tika. The example mail core is not 
 complete; files are missing.




Re: Using term offsets for hit highlighting

2012-05-23 Thread Alan Woodward
OK, so the most straightforward way to do that would be to change the signature 
to positions(boolean needsPayloads, boolean needsOffsets), I guess.  This is a 
new API so it's not breaking anything.  

It'll be tomorrow morning before I have a proper go at this now (Cambridge Beer 
Festival tonight…).  Is the mailing list the best place to discuss this, or is 
JIRA/IRC better?

On 23 May 2012, at 13:43, Simon Willnauer wrote:

 hey alan,
 
 I added position iterator support to ConjunctionTermScorer and
 committed it to the branch. All tests that don't rely on payloads are
 passing in core. Previously we had to decide whether we needed positions up
 front; the current code can pull them lazily, which causes fewer changes
 to the Scorer API. I think we should keep it that way; the only
 problem is that we currently have no way to pass information to the
 iterators about whether we need payloads or not. The same is true for offsets,
 since they are now in the index. I think it would be good if you could
 tackle the payloads first and pass some info to the Scorer#positions()
 method so we can pull the right thing.
 
 happy coding.
 
 simon
 
 On Wed, May 23, 2012 at 1:23 PM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk wrote:
 Sweet, thanks Simon.  I'll have a go at getting some failing tests passing 
 to begin with.
 
 On 23 May 2012, at 11:59, Simon Willnauer wrote:
 
 alan,
 
 I merged the branch manually and created a new branch from it. its
 here: https://svn.apache.org/repos/asf/lucene/dev/branches/LUCENE-2878
 the branch compiles but lots of nocommits / todos
 
 if you have questions please ask I will help as much as I can
 
 simon
 
 On Tue, May 22, 2012 at 8:38 PM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk wrote:
 Hey, I reckon I can have a decent go at getting the branch updated.  Is it 
 best to work this out as a patch applying to trunk?  Any patch that merges 
 in all the trunk changes to the branch is going to be absolutely massive…
 
 On 17 May 2012, at 13:15, Simon Willnauer wrote:
 
 ok man. I will try to merge up the branch. I tell you this is going to
 be messy and it might not compile but I will make it reasonable so you
 can start.
 
 simon
 
 On Thu, May 17, 2012 at 8:03 AM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk wrote:
 Sorry for vanishing for so long, life unexpectedly caught up with me...  
 I'm going to have some time to look at this again next week though, if 
 you're interested in picking it up again.
 
 On 21 Mar 2012, at 09:02, Alan Woodward wrote:
 
 That would be great, thanks!  I had a go at merging it last night, but 
 there are a *lot* of changes that I haven't got my head round yet, so 
 it was getting pretty messy.
 
 On 21 Mar 2012, at 08:49, Simon Willnauer wrote:
 
 Alan, if you want I can just merge the branch up next week and we
 iterate from there?
 
 simon
 
 On Tue, Mar 20, 2012 at 12:34 PM, Erick Erickson
 erickerick...@gmail.com wrote:
 Yep, the first challenge is always getting the old patch(es) to 
 apply.
 
 On Tue, Mar 20, 2012 at 4:09 AM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk wrote:
 Thanks for all the offers of help!  It looks as though most of the 
 hard work has already been done, which is exactly where I like to 
 pick up projects.  :-)
 
 Maybe the best place to start would be for me to rebase the branch 
 against trunk, and see what still fits?  I think there have been 
 some fairly major changes in the internals since July last year.
 
 On 19 Mar 2012, at 17:07, Mike Sokolov wrote:
 
 I posted a patch with a Collector somewhat similar to what you 
 described, Alan - it's attached to one of the sub-issues 
 https://issues.apache.org/jira/browse/LUCENE-3318.   It is in a 
 fairly complete alpha state, but has seen no production use of 
 course, since it relies on the remainder of the unfinished work in 
 that branch.  It works by creating a TokenStream based on match 
 positions returned from the query and passing that to the existing 
 Highlighter.  Please feel free to get in touch if you decide to 
 look into that and have questions.
 
 
 -Mike
 
 On 03/19/2012 11:51 AM, Simon Willnauer wrote:
 On Mon, Mar 19, 2012 at 4:50 PM, Uwe Schindleru...@thetaphi.de  
 wrote:
 
 Have you marked that for GSOC? Would be a good idea!
 
 yes I did
 
 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de
 
 
 
 -Original Message-
 From: Simon Willnauer [mailto:simon.willna...@googlemail.com]
 Sent: Monday, March 19, 2012 4:43 PM
 To: dev@lucene.apache.org
 Subject: Re: Using term offsets for hit highlighting
 
 Alan, you made my day!
 
 The branch is kind of outdated but I looked at it lately and I 
 can certainly help
 to get it up to speed. The feature in that branch is quite a big 
 one and its in a
 very early stage. Still I want to encourage you to take a look 
 and work on it. I
 promise all my help with the issues!
 
 let me know if you have questions!
 
 simon
 
 On Mon, Mar 19, 

[jira] [Commented] (SOLR-3482) Cannot index emails, mistakes of configuration file data-config.xml & solrconfig.xml

2012-05-23 Thread Stefan Matheis (steffkes) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281719#comment-13281719
 ] 

Stefan Matheis (steffkes) commented on SOLR-3482:
-

Emma, did you find anything else in data-config.xml other than the missing 
entity name, which is already reported & fixed in SOLR-3478?

 Cannot index emails, mistakes of configuration file data-config.xml & 
 solrconfig.xml
 --

 Key: SOLR-3482
 URL: https://issues.apache.org/jira/browse/SOLR-3482
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.0
 Environment: windows
Reporter: Emma Bo Liu
Priority: Minor
  Labels: core, email, index, solr, tika

 The mail core cannot be brought up. There are mistakes in data-config.xml & 
 solrconfig.xml. It cannot find Tika. The example mail core is not 
 complete; files are missing.




[jira] [Updated] (SOLR-3482) Cannot index emails, mistakes of configuration file data-config.xml & solrconfig.xml

2012-05-23 Thread Emma Bo Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emma Bo Liu updated SOLR-3482:
--

Description: The mail core cannot be brought up. There are mistakes in 
data-config.xml & solrconfig.xml. It cannot find Tika. The example mail 
core is not complete; files are missing. There is a mistake in the Solr 
MailEntityProcessor tutorial.  (was: The mail core cannot be brought up. There 
are mistakes in data-config.xml & solrconfig.xml. It cannot find Tika. The 
example mail core is not complete; files are missing.)

 Cannot index emails, mistakes of configuration file data-config.xml & 
 solrconfig.xml
 --

 Key: SOLR-3482
 URL: https://issues.apache.org/jira/browse/SOLR-3482
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.0
 Environment: windows
Reporter: Emma Bo Liu
Priority: Minor
  Labels: core, email, index, solr, tika

 The mail core cannot be brought up. There are mistakes in data-config.xml & 
 solrconfig.xml. It cannot find Tika. The example mail core is not 
 complete; files are missing. There is a mistake in the Solr MailEntityProcessor 
 tutorial.




[jira] [Commented] (SOLR-3482) Cannot index emails, mistakes of configuration file data-config.xml & solrconfig.xml

2012-05-23 Thread Emma Bo Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281724#comment-13281724
 ] 

Emma Bo Liu commented on SOLR-3482:
---

On the mail core data-config.xml of example-DIH, the entity doesn't have a name, 
and neither does it in the Solr MailEntityProcessor tutorial. I am glad the issue 
with the entity name is solved, but there are still other mistakes in the mail 
core and Tika. I will update the patch with a correct mail-core configuration quickly. 

 Cannot index emails, mistakes of configuration file data-config.xml & 
 solrconfig.xml
 --

 Key: SOLR-3482
 URL: https://issues.apache.org/jira/browse/SOLR-3482
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.0
 Environment: windows
Reporter: Emma Bo Liu
Priority: Minor
  Labels: core, email, index, solr, tika

 The mail core cannot be brought up. There are mistakes in data-config.xml & 
 solrconfig.xml. It cannot find Tika. The example mail core is not 
 complete; files are missing. There is a mistake in the Solr MailEntityProcessor 
 tutorial.




[jira] [Updated] (SOLR-3482) Cannot index emails, mistakes of configuration file data-config.xml & solrconfig.xml, Cannot find Tika

2012-05-23 Thread Emma Bo Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emma Bo Liu updated SOLR-3482:
--

Description: 
The mail core cannot be brought up. There are mistakes in data-config.xml & 
solrconfig.xml. The example mail core is not complete; files are missing. There 
is a mistake in the Solr MailEntityProcessor tutorial.

It cannot find Tika even though it includes the dataimporter-extra jar file. 


  was:The mail core cannot be brought up. There are mistakes in data-config.xml & 
solrconfig.xml. It cannot find Tika. The example mail core is not complete; 
files are missing. There is a mistake in the Solr MailEntityProcessor tutorial.

Summary: Cannot index emails, mistakes of configuration file 
data-config.xml & solrconfig.xml, Cannot find Tika   (was: Cannot index emails, 
mistakes of configuration file data-config.xml & solrconfig.xml)

 Cannot index emails, mistakes of configuration file data-config.xml & 
 solrconfig.xml, Cannot find Tika 
 -

 Key: SOLR-3482
 URL: https://issues.apache.org/jira/browse/SOLR-3482
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.0
 Environment: windows
Reporter: Emma Bo Liu
Priority: Minor
  Labels: core, email, index, solr, tika

 The mail core cannot be brought up. There are mistakes in data-config.xml & 
 solrconfig.xml. The example mail core is not complete; files are missing. 
 There is a mistake in the Solr MailEntityProcessor tutorial.
 It cannot find Tika even though it includes the dataimporter-extra jar 
 file. 




[jira] [Commented] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen

2012-05-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281751#comment-13281751
 ] 

Michael McCandless commented on LUCENE-4062:


bq. A third option could be to write padding bits (Packed64SingleBlock 
subclasses may have such padding bits) as well, but I really dislike the fact 
that the on-disk format is implementation-dependent.

Actually, I think we should stop specializing based on 32-bit vs. 64-bit
JREs, and always use the impls backed by long[] (Packed64*).  Then, I
think it's fine if we write the long[] image (with padding bits)
directly to disk?


 More fine-grained control over the packed integer implementation that is 
 chosen
 ---

 Key: LUCENE-4062
 URL: https://issues.apache.org/jira/browse/LUCENE-4062
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/other
Reporter: Adrien Grand
Assignee: Michael McCandless
Priority: Minor
  Labels: performance
 Fix For: 4.1

 Attachments: LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, 
 LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch


 In order to save space, Lucene has two main PackedInts.Mutable implementations, 
 one that is very fast and is based on a byte/short/integer/long array 
 (Direct*) and another one which packs bits in a memory-efficient manner 
 (Packed*).
 The packed implementation tends to be much slower than the direct one, which 
 discourages some Lucene components from using it. On the other hand, if you store 
 21-bit integers in a Direct32, this is a space loss of (32-21)/32=35%.
 If you accept trading some space for speed, you could store three of these 
 21-bit integers in a long, resulting in an overhead of 1/3 bit per value. One 
 advantage of this approach is that you never need to read more than one block 
 to read or write a value, so this can be significantly faster than Packed32 
 and Packed64, which always need to read/write two blocks in order to avoid 
 costly branches.
 I ran some tests, and for 1000 21-bit values, this implementation takes 
 less than 2% more space and has 44% faster writes and 30% faster reads. The 
 12-bit version (5 values per block) has the same performance improvement and 
 a 6% memory overhead compared to the packed implementation.
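The three-values-per-long layout described above can be sketched in a few lines of plain Java. This is an illustration of the bit arithmetic only, not the actual Packed64SingleBlock implementation; Pack21Demo and its pack/get helpers are hypothetical names.

```java
public class Pack21Demo {
    static final int BITS = 21;
    static final long MASK = (1L << BITS) - 1; // 0x1FFFFF, max value 2097151

    // Pack three 21-bit values into one long; the single top bit is padding.
    public static long pack(long a, long b, long c) {
        return (a & MASK) | ((b & MASK) << BITS) | ((c & MASK) << (2 * BITS));
    }

    // Extract value i (0..2) with one shift and one mask -- a single block
    // read, never straddling two longs, which is where the speedup comes from.
    public static long get(long block, int i) {
        return (block >>> (i * BITS)) & MASK;
    }

    public static void main(String[] args) {
        long block = pack(123456, 2097151, 42);
        System.out.println(get(block, 0)); // 123456
        System.out.println(get(block, 1)); // 2097151
        System.out.println(get(block, 2)); // 42
    }
}
```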
 In order to select the best implementation for a given integer size, I wrote 
 the {{PackedInts.getMutable(valueCount, bitsPerValue, 
 acceptableOverheadPerValue)}} method. This method selects the fastest 
 implementation that has less than {{acceptableOverheadPerValue}} wasted bits 
 per value. For example, if you accept an overhead of 20% 
 ({{acceptableOverheadPerValue = 0.2f * bitsPerValue}}), which is pretty 
 reasonable, here is what implementations would be selected:
  * 1: Packed64SingleBlock1
  * 2: Packed64SingleBlock2
  * 3: Packed64SingleBlock3
  * 4: Packed64SingleBlock4
  * 5: Packed64SingleBlock5
  * 6: Packed64SingleBlock6
  * 7: Direct8
  * 8: Direct8
  * 9: Packed64SingleBlock9
  * 10: Packed64SingleBlock10
  * 11: Packed64SingleBlock12
  * 12: Packed64SingleBlock12
  * 13: Packed64
  * 14: Direct16
  * 15: Direct16
  * 16: Direct16
  * 17: Packed64
  * 18: Packed64SingleBlock21
  * 19: Packed64SingleBlock21
  * 20: Packed64SingleBlock21
  * 21: Packed64SingleBlock21
  * 22: Packed64
  * 23: Packed64
  * 24: Packed64
  * 25: Packed64
  * 26: Packed64
  * 27: Direct32
  * 28: Direct32
  * 29: Direct32
  * 30: Direct32
  * 31: Direct32
  * 32: Direct32
  * 33: Packed64
  * 34: Packed64
  * 35: Packed64
  * 36: Packed64
  * 37: Packed64
  * 38: Packed64
  * 39: Packed64
  * 40: Packed64
  * 41: Packed64
  * 42: Packed64
  * 43: Packed64
  * 44: Packed64
  * 45: Packed64
  * 46: Packed64
  * 47: Packed64
  * 48: Packed64
  * 49: Packed64
  * 50: Packed64
  * 51: Packed64
  * 52: Packed64
  * 53: Packed64
  * 54: Direct64
  * 55: Direct64
  * 56: Direct64
  * 57: Direct64
  * 58: Direct64
  * 59: Direct64
  * 60: Direct64
  * 61: Direct64
  * 62: Direct64
 Under 32 bits per value, only 13, 17 and 22-26 bits per value would still 
 choose the slower Packed64 implementation. Allowing a 50% overhead would 
 prevent the packed implementation from being selected for bits per value under 32. 
 Allowing an overhead of 32 bits per value would make sure that a Direct* 
 implementation is always selected.
 Next steps would be to:
  * make Lucene components use this {{getMutable}} method and let users decide 
 what trade-off better suits them,
  * write a Packed32SingleBlock implementation if necessary (I didn't do it 
 because I have no 32-bit computer to test the performance improvements).
 I think this would allow more fine-grained control over the speed/space 
 trade-off, what do you 

Build failed in Jenkins: Lucene-Solr-trunk-Linux-Java6-64 #489

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux-Java6-64/489/

--
[...truncated 12159 lines...]
   [junit4] Suite: org.apache.solr.core.RAMDirectoryFactoryTest
   [junit4] Completed on J1 in 0.01s, 1 test
   [junit4]  
   [junit4] Suite: 
org.apache.solr.analysis.TestHyphenationCompoundWordTokenFilterFactory
   [junit4] Completed on J1 in 0.08s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.StandardRequestHandlerTest
   [junit4] Completed on J0 in 0.56s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.analysis.TestPhoneticFilterFactory
   [junit4] Completed on J0 in 9.44s, 5 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.FieldAnalysisRequestHandlerTest
   [junit4] Completed on J0 in 0.57s, 4 tests
   [junit4]  
   [junit4] Suite: 
org.apache.solr.handler.component.DistributedSpellCheckComponentTest
   [junit4] Completed on J0 in 6.83s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.LeaderElectionTest
   [junit4] Completed on J1 in 16.51s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.request.TestFaceting
   [junit4] Completed on J1 in 7.84s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.TestDistributedSearch
   [junit4] Completed on J0 in 15.25s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.request.SimpleFacetsTest
   [junit4] Completed on J0 in 3.50s, 29 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.ZkControllerTest
   [junit4] Completed on J1 in 6.89s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.SolrCoreTest
   [junit4] Completed on J1 in 3.47s, 5 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestSort
   [junit4] Completed on J0 in 3.93s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.SolrInfoMBeanTest
   [junit4] Completed on J0 in 0.54s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.BasicFunctionalityTest
   [junit4] IGNORED 0.00s J1 | BasicFunctionalityTest.testDeepPaging
   [junit4] Cause: Annotated @Ignore(See SOLR-1726)
   [junit4] Completed on J1 in 1.49s, 23 tests, 1 skipped
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.IndexBasedSpellCheckerTest
   [junit4] Completed on J1 in 0.77s, 5 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.update.SolrCmdDistributorTest
   [junit4] Completed on J0 in 1.13s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.admin.LukeRequestHandlerTest
   [junit4] Completed on J1 in 1.53s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestCoreContainer
   [junit4] Completed on J0 in 1.59s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.request.TestWriterPerf
   [junit4] Completed on J1 in 0.74s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.analysis.TestWordDelimiterFilterFactory
   [junit4] Completed on J0 in 0.86s, 7 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.function.distance.DistanceFunctionTest
   [junit4] Completed on J1 in 0.60s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.XsltUpdateRequestHandlerTest
   [junit4] Completed on J0 in 0.67s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.core.SolrCoreCheckLockOnStartupTest
   [junit4] Completed on J1 in 0.93s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.DocumentAnalysisRequestHandlerTest
   [junit4] Completed on J0 in 0.63s, 4 tests
   [junit4]  
   [junit4] Suite: 
org.apache.solr.update.processor.FieldMutatingUpdateProcessorTest
   [junit4] Completed on J1 in 0.49s, 20 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.RequestHandlersTest
   [junit4] Completed on J0 in 0.61s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.FileBasedSpellCheckerTest
   [junit4] Completed on J1 in 0.60s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.PrimitiveFieldTypeTest
   [junit4] Completed on J0 in 0.74s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestQueryUtils
   [junit4] Completed on J1 in 0.66s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestValueSourceCache
   [junit4] Completed on J0 in 0.65s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.response.TestPHPSerializedResponseWriter
   [junit4] Completed on J1 in 0.49s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.DisMaxRequestHandlerTest
   [junit4] Completed on J0 in 0.62s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.util.SolrPluginUtilsTest
   [junit4] Completed on J1 in 0.58s, 7 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.IndexReaderFactoryTest
   [junit4] Completed on J1 in 0.44s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.RequiredFieldsTest
   [junit4] Completed on J0 in 0.46s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.request.JSONWriterTest
   [junit4] Completed on J1 in 0.48s, 3 tests
   [junit4]  
   [junit4] Suite: 

[jira] [Resolved] (SOLR-3481) Date field value differs between two installations

2012-05-23 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-3481.


Resolution: Incomplete

David: there isn't enough information here to understand what problem you might 
be having, or whether there is in fact a bug in Solr (as opposed to a 
configuration discrepancy in your setup).

please start a thread on the solr-user@lucene mailing list with more details 
(ie: your schema.xml, including field types, example documents you index, 
example queries you run, what output you get from those queries, etc...) 

https://wiki.apache.org/solr/UsingMailingLists

 Date field value differs between two installations
 --

 Key: SOLR-3481
 URL: https://issues.apache.org/jira/browse/SOLR-3481
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other
Affects Versions: 3.6
 Environment: A. Mac 10.7.4 with integrated Jetty
 B. Ubuntu 12.04 with Tomcat
Reporter: David Rekowski
  Labels: datefield, format, mac

 When I query the Solr server, I get a formatted timestamp in environment A 
 (2012-05-11T12:59:01.691Z), whereas in environment B I get a number like 
 1336728376797, which looks like a Unix epoch timestamp in milliseconds.
 The corresponding schema definition:
 <field name="index_time_s" type="date" indexed="true" stored="true" 
  default="NOW" multiValued="false"/>
 Background: We migrated an index generated on a mac/jetty to a linux/tomcat 
 installation of Solr. Regardless of that, this happens with newly indexed 
 documents as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Using term offsets for hit highlighting

2012-05-23 Thread Simon Willnauer
Hey Alan,


On Wed, May 23, 2012 at 6:46 PM, Alan Woodward
alan.woodw...@romseysoftware.co.uk wrote:
 OK, so the most straightforward way to do that would be to change the 
 signature to positions(boolean needsPayloads, boolean needsOffsets), I guess. 
  This is a new API so it's not breaking anything.

yeah I'd think so. this is also consistent with how we pull scorers, and it's
safe in terms of changes, ie. you won't miss an API change vs. using a
struct-like object. I am not sure how we expose the offsets yet, but
for now let's make the tests pass. That should give you a good and
straightforward start though. Don't worry about the API for now; we
are in a dev phase that doesn't need to produce a fixed API, and we will
straighten that out iteratively as we go.


 It'll be tomorrow morning before I have a proper go at this now (Cambridge 
 Beer Festival tonight…).  Is the mailing list the best place to discuss this, 
 or is JIRA/IRC better?

patches should go on the issue, and code discussions related to the
patches too. It might make sense to have discussion of a broader scope
on the dev list; decisions made on the list should be referenced on
the issue. IRC might make sense too if you have some questions that
are better answered interactively. Yet, any decisions should also be
discussed here or on the issue. If something we discussed on IRC leads
to some design decisions, it's wise to repeat them on the issue so folks
can reproduce the decision-making process. In any case, if it's IRC, make
sure it's #lucene-dev

looking forward to the patches...

simon

 On 23 May 2012, at 13:43, Simon Willnauer wrote:

 hey alan,

 I added position iterator support to ConjunctionTermScorer and
 committed it to the branch. All tests that don't rely on payloads are
 passing in core. Previously we had to decide if we need positions up
 front; the current code can pull them lazily, which causes fewer changes
 on the Scorer API. I think we should keep it that way; the only
 problem is that we currently have no way to pass information to the
 iterators about whether we need payloads or not. Same is true for offsets,
 since they are now in the index. I think it would be good if you could
 tackle the payloads first and pass some info to the Scorer#positions()
 method so we can pull the right thing.

 happy coding.

 simon

 On Wed, May 23, 2012 at 1:23 PM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk wrote:
 Sweet, thanks Simon.  I'll have a go at getting some failing tests passing 
 to begin with.

 On 23 May 2012, at 11:59, Simon Willnauer wrote:

 alan,

 I merged the branch manually and created a new branch from it. its
 here: https://svn.apache.org/repos/asf/lucene/dev/branches/LUCENE-2878
 the branch compiles but lots of nocommits / todos

 if you have questions please ask I will help as much as I can

 simon

 On Tue, May 22, 2012 at 8:38 PM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk wrote:
 Hey, I reckon I can have a decent go at getting the branch updated.  Is 
 it best to work this out as a patch applying to trunk?  Any patch that 
 merges in all the trunk changes to the branch is going to be absolutely 
 massive…

 On 17 May 2012, at 13:15, Simon Willnauer wrote:

 ok man. I will try to merge up the branch. I tell you this is going to
 be messy and it might not compile but I will make it reasonable so you
 can start.

 simon

 On Thu, May 17, 2012 at 8:03 AM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk wrote:
 Sorry for vanishing for so long, life unexpectedly caught up with me... 
  I'm going to have some time to look at this again next week though, if 
 you're interested in picking it up again.

 On 21 Mar 2012, at 09:02, Alan Woodward wrote:

 That would be great, thanks!  I had a go at merging it last night, but 
 there are a *lot* of changes that I haven't got my head round yet, so 
 it was getting pretty messy.

 On 21 Mar 2012, at 08:49, Simon Willnauer wrote:

 Alan, if you want I can just merge the branch up next week and we
 iterate from there?

 simon

 On Tue, Mar 20, 2012 at 12:34 PM, Erick Erickson
 erickerick...@gmail.com wrote:
 Yep, the first challenge is always getting the old patch(es) to 
 apply.

 On Tue, Mar 20, 2012 at 4:09 AM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk wrote:
 Thanks for all the offers of help!  It looks as though most of the 
 hard work has already been done, which is exactly where I like to 
 pick up projects.  :-)

 Maybe the best place to start would be for me to rebase the branch 
 against trunk, and see what still fits?  I think there have been 
 some fairly major changes in the internals since July last year.

 On 19 Mar 2012, at 17:07, Mike Sokolov wrote:

 I posted a patch with a Collector somewhat similar to what you 
 described, Alan - it's attached to one of the sub-issues 
 https://issues.apache.org/jira/browse/LUCENE-3318.   It is in a 
 fairly complete alpha state, but has seen no production use of 
 course, since it relies on the remainder of the unfinished work 

Jenkins build is back to normal : Lucene-Solr-trunk-Linux-Java6-64 #490

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux-Java6-64/490/





[jira] [Commented] (LUCENE-2443) Don't assume IntsRef offset is 0 after postings bulk read

2012-05-23 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281783#comment-13281783
 ] 

Simon Willnauer commented on LUCENE-2443:
-

Seems like this is invalid now, no?

 Don't assume IntsRef offset is 0 after postings bulk read
 -

 Key: LUCENE-2443
 URL: https://issues.apache.org/jira/browse/LUCENE-2443
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0


 Yonik found 2 places where we assume the ints starts at offset=0 after bulk 
 read -- we can't do this because in general a codec can give us a slice into 
 private int[] arrays, eg int block codec.







Change Similarity in Solr MoreLikeThis

2012-05-23 Thread Emmanuel Espina
LUCENE-896 added support for changing the Similarity class of MoreLikeThis,
but this functionality has not been exposed to Solr. I would
like to create a JIRA issue and submit a patch for this.

Do you agree?

Thanks
Emmanuel




[jira] [Resolved] (LUCENE-3715) TestStressIndexing2 fails with AssertionFailedError

2012-05-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-3715.
-

Resolution: Cannot Reproduce

closing this for now... it never reproduced

 TestStressIndexing2 fails with AssertionFailedError
 

 Key: LUCENE-3715
 URL: https://issues.apache.org/jira/browse/LUCENE-3715
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.0
Reporter: Simon Willnauer
 Fix For: 4.0


 JENKINS reported this lately. I suspect a test issue due to the 
 RandomDWPThreadPool, but I need to dig deeper.
 Here is the failure to reproduce:
 {noformat}
 [junit] Testcase: 
 testMultiConfig(org.apache.lucene.index.TestStressIndexing2):   FAILED
 [junit] r1 is not empty but r2 is
 [junit] junit.framework.AssertionFailedError: r1 is not empty but r2 is
 [junit]   at 
 org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
 [junit]   at 
 org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
 [junit]   at 
 org.apache.lucene.index.TestStressIndexing2.verifyEquals(TestStressIndexing2.java:339)
 [junit]   at 
 org.apache.lucene.index.TestStressIndexing2.verifyEquals(TestStressIndexing2.java:277)
 [junit]   at 
 org.apache.lucene.index.TestStressIndexing2.testMultiConfig(TestStressIndexing2.java:126)
 [junit]   at 
 org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:529)
 [junit] 
 [junit] 
 [junit] Tests run: 3, Failures: 1, Errors: 0, Time elapsed: 2.598 sec
 [junit] 
 [junit] - Standard Error -
 [junit] NOTE: reproduce with: ant test -Dtestcase=TestStressIndexing2 
 -Dtestmethod=testMultiConfig 
 -Dtests.seed=5df78431615a5fbf:45b35512c8b8741a:235b5758de97148e 
 -Dtests.multiplier=3 -Dtests.nightly=true -Dargs=-Dfile.encoding=ISO8859-1
 [junit] NOTE: test params are: codec=Lucene3x, 
 sim=RandomSimilarityProvider(queryNorm=true,coord=true): {f34=DFR GZ(0.3), 
 f33=IB SPL-D2, f32=DFR I(n)B2, f31=DFR I(ne)B1, f30=IB LL-L2, f79=DFR 
 I(n)3(800.0), f78=DFR I(F)L2, f75=DFR I(n)BZ(0.3), f76=DFR GLZ(0.3), f39=DFR 
 I(n)BZ(0.3), f38=DFR I(F)3(800.0), f73=DFR I(ne)L1, f74=DFR I(F)3(800.0), 
 f37=DFR I(ne)L1, f36=DFR I(ne)3(800.0), f71=DFR I(F)B3(800.0), f35=DFR 
 I(F)B3(800.0), f72=DFR I(ne)3(800.0), f81=DFR GZ(0.3), f80=IB SPL-D2, f43=DFR 
 I(ne)BZ(0.3), f42=DFR I(F)Z(0.3), f45=IB SPL-L2, f41=DFR I(F)BZ(0.3), f40=DFR 
 I(n)B1, f86=DFR I(ne)B3(800.0), f87=DFR GB1, f88=IB SPL-D3(800.0), f89=DFR 
 I(F)L3(800.0), f82=DFR GL2, f47=DFR I(ne)LZ(0.3), f46=DFR GL2, f83=DFR 
 I(ne)LZ(0.3), f49=DFR I(ne)Z(0.3), f84=DFR I(F)B2, f48=DFR I(F)B2, f85=DFR 
 I(ne)Z(0.3), f90=DFR I(ne)BZ(0.3), f92=IB SPL-L2, f91=DFR I(n)Z(0.3), f59=DFR 
 G2, f6=IB SPL-DZ(0.3), f7=IB LL-L1, f57=IB LL-L3(800.0), f8=DFR 
 I(n)L3(800.0), f58=DFR I(n)LZ(0.3), f12=DFR I(F)1, f11=DFR I(n)L2, f10=DFR 
 I(F)LZ(0.3), f51=DFR I(n)L1, f15=DFR I(n)L1, f52=DFR I(F)L2, f14=DFR 
 GLZ(0.3), f13=DFR I(n)BZ(0.3), f55=DFR GL3(800.0), f19=DFR GL3(800.0), f56=IB 
 LL-L2, f53=DFR I(F)L1, f18=BM25(k1=1.2,b=0.75), f17=DFR I(F)L1, 
 f54=BM25(k1=1.2,b=0.75), id=DFR I(F)L2, f1=DFR I(n)B3(800.0), f0=DFR G2, 
 f3=DFR I(ne)3(800.0), f2=DFR I(F)B3(800.0), f5=DFR I(F)3(800.0), f4=DFR 
 I(ne)L1, f68=DFR I(n)2, f69=DFR I(ne)2, f21=IB LL-LZ(0.3), f20=DFR I(n)1, 
 f23=DFR GB2, f22=DFR I(ne)B2, f60=DFR I(ne)B3(800.0), f25=DFR GB1, f61=DFR 
 GB1, f24=DFR I(ne)B3(800.0), f62=IB SPL-D3(800.0), f27=DFR I(F)L3(800.0), 
 f26=IB SPL-D3(800.0), f63=DFR I(F)L3(800.0), f64=DFR GL1, f29=DFR I(ne)1, 
 f65=DFR I(ne)1, f28=DFR GL1, f66=DFR I(n)B1, f67=DFR I(F)BZ(0.3), f98=DFR 
 I(n)LZ(0.3), f97=IB LL-L3(800.0), f99=DFR G2, f94=DefaultSimilarity, f93=DFR 
 I(n)3(800.0), f70=DFR GB2, f96=LM Jelinek-Mercer(0.70), f95=DFR 
 GBZ(0.3)}, locale=ms, timezone=Africa/Bangui
 [junit] NOTE: all tests run in this JVM:
 [junit] [TestDemo, TestSearch, TestCachingTokenFilter, TestSurrogates, 
 TestPulsingReuse, TestAddIndexes, TestBinaryTerms, TestCodecs, 
 TestCrashCausesCorruptIndex, TestDocsAndPositions, TestFieldInfos, 
 TestFilterIndexReader, TestFlex, TestIndexReader, TestIndexWriterMergePolicy, 
 TestIndexWriterNRTIsCurrent, TestIndexWriterOnJRECrash, 
 TestIndexWriterWithThreads, TestNeverDelete, TestNoDeletionPolicy, 
 TestOmitNorms, TestParallelReader, TestPayloads, TestRandomStoredFields, 
 TestRollback, TestRollingUpdates, TestSegmentInfo, TestStressIndexing2]
 [junit] NOTE: FreeBSD 8.2-RELEASE amd64/Sun Microsystems Inc. 1.6.0 
 (64-bit)/cpus=16,threads=1,free=349545000,total=477233152
 {noformat}
 this failed on revision:
 http://svn.apache.org/repos/asf/lucene/dev/trunk : 1233708


[jira] [Commented] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen

2012-05-23 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281793#comment-13281793
 ] 

Adrien Grand commented on LUCENE-4062:
--

Mike, I am not sure how we should do it. For 21-bits values how would the 
reader know whether it should use a Packed64SingleBlock21 or a Packed64? Should 
we add a flag to the data stream in order to know what implementation 
serialized the integers?
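For illustration, the flag discussed here could be as simple as an id byte written ahead of the blocks, so the reader can dispatch to the matching decoder. The sketch below uses plain java.io and invented ids and helper names; it is not Lucene's actual on-disk format.

```java
import java.io.*;

public class FormatFlagSketch {
    // Hypothetical ids; Lucene's real format does not define these constants.
    static final byte PACKED64 = 0;
    static final byte PACKED64_SINGLE_BLOCK = 1;

    // Serialize: an id byte naming the writing implementation, then the blocks.
    static byte[] write(byte formatId, long[] blocks) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(formatId);   // the flag under discussion
            out.writeInt(blocks.length);
            for (long b : blocks) out.writeLong(b);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The reader inspects the id first and can then build the matching decoder.
    static byte readFormatId(byte[] data) {
        try {
            return new DataInputStream(new ByteArrayInputStream(data)).readByte();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = write(PACKED64_SINGLE_BLOCK, new long[] {42L});
        System.out.println(readFormatId(data)); // prints 1
    }
}
```

The cost is one extra byte in the stream, in exchange for letting the writer pick any implementation without breaking readers.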

 More fine-grained control over the packed integer implementation that is 
 chosen
 ---

 Key: LUCENE-4062
 URL: https://issues.apache.org/jira/browse/LUCENE-4062
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/other
Reporter: Adrien Grand
Assignee: Michael McCandless
Priority: Minor
  Labels: performance
 Fix For: 4.1

 Attachments: LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, 
 LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch


 In order to save space, Lucene has two main PackedInts.Mutable implementations, 
 one that is very fast and is based on a byte/short/integer/long array 
 (Direct*) and another one which packs bits in a memory-efficient manner 
 (Packed*).
 The packed implementation tends to be much slower than the direct one, which 
 discourages some Lucene components from using it. On the other hand, if you 
 store 21-bit integers in a Direct32, this is a space loss of (32-21)/32≈34%.
 If you are willing to trade some space for speed, you could store 3 of these 
 21-bit integers in a long, resulting in an overhead of 1/3 bit per value. One 
 advantage of this approach is that you never need to read more than one block 
 to read or write a value, so this can be significantly faster than Packed32 
 and Packed64, which always need to read/write two blocks in order to avoid 
 costly branches.
 I ran some tests, and for 1000 21-bit values, this implementation takes 
 less than 2% more space and has 44% faster writes and 30% faster reads. The 
 12-bit version (5 values per block) has the same performance improvement and 
 a 6% memory overhead compared to the packed implementation.
 In order to select the best implementation for a given integer size, I wrote 
 the {{PackedInts.getMutable(valueCount, bitsPerValue, 
 acceptableOverheadPerValue)}} method. This method selects the fastest 
 implementation that has less than {{acceptableOverheadPerValue}} wasted bits 
 per value. For example, if you accept an overhead of 20% 
 ({{acceptableOverheadPerValue = 0.2f * bitsPerValue}}), which is pretty 
 reasonable, here are the implementations that would be selected:
  * 1: Packed64SingleBlock1
  * 2: Packed64SingleBlock2
  * 3: Packed64SingleBlock3
  * 4: Packed64SingleBlock4
  * 5: Packed64SingleBlock5
  * 6: Packed64SingleBlock6
  * 7: Direct8
  * 8: Direct8
  * 9: Packed64SingleBlock9
  * 10: Packed64SingleBlock10
  * 11: Packed64SingleBlock12
  * 12: Packed64SingleBlock12
  * 13: Packed64
  * 14: Direct16
  * 15: Direct16
  * 16: Direct16
  * 17: Packed64
  * 18: Packed64SingleBlock21
  * 19: Packed64SingleBlock21
  * 20: Packed64SingleBlock21
  * 21: Packed64SingleBlock21
  * 22: Packed64
  * 23: Packed64
  * 24: Packed64
  * 25: Packed64
  * 26: Packed64
  * 27: Direct32
  * 28: Direct32
  * 29: Direct32
  * 30: Direct32
  * 31: Direct32
  * 32: Direct32
  * 33: Packed64
  * 34: Packed64
  * 35: Packed64
  * 36: Packed64
  * 37: Packed64
  * 38: Packed64
  * 39: Packed64
  * 40: Packed64
  * 41: Packed64
  * 42: Packed64
  * 43: Packed64
  * 44: Packed64
  * 45: Packed64
  * 46: Packed64
  * 47: Packed64
  * 48: Packed64
  * 49: Packed64
  * 50: Packed64
  * 51: Packed64
  * 52: Packed64
  * 53: Packed64
  * 54: Direct64
  * 55: Direct64
  * 56: Direct64
  * 57: Direct64
  * 58: Direct64
  * 59: Direct64
  * 60: Direct64
  * 61: Direct64
  * 62: Direct64
 Under 32 bits per value, only 13, 17 and 22-26 bits per value would still 
 choose the slower Packed64 implementation. Allowing a 50% overhead would 
 prevent the packed implementation from being selected for bits per value under 
 32. Allowing an overhead of 32 bits per value would make sure that a Direct* 
 implementation is always selected.
 Next steps would be to:
  * make Lucene components use this {{getMutable}} method and let users decide 
 what trade-off better suits them,
  * write a Packed32SingleBlock implementation if necessary (I didn't do it 
 because I have no 32-bit computer to test the performance improvements).
 I think this would allow more fine-grained control over the speed/space 
 trade-off, what do you think?
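The 3-values-per-long layout described in the issue can be sketched in plain Java. This is only an illustration of the technique, not Lucene's actual Packed64SingleBlock code; the class and method names below are made up.

```java
// Sketch of a single-block packed array: 3 independent 21-bit values per
// 64-bit block, so every read or write touches exactly one long, at the
// cost of 1 wasted bit per block (1/3 bit per value).
public class Packed21Sketch {
    private static final int BITS = 21;
    private static final int VALUES_PER_BLOCK = 64 / BITS;  // 3 values per long
    private static final long MASK = (1L << BITS) - 1;      // low 21 bits set

    private final long[] blocks;

    public Packed21Sketch(int valueCount) {
        blocks = new long[(valueCount + VALUES_PER_BLOCK - 1) / VALUES_PER_BLOCK];
    }

    public long get(int index) {
        int shift = (index % VALUES_PER_BLOCK) * BITS;
        return (blocks[index / VALUES_PER_BLOCK] >>> shift) & MASK;
    }

    public void set(int index, long value) {
        int b = index / VALUES_PER_BLOCK;
        int shift = (index % VALUES_PER_BLOCK) * BITS;
        // Clear the slot, then OR in the new value.
        blocks[b] = (blocks[b] & ~(MASK << shift)) | ((value & MASK) << shift);
    }

    public static void main(String[] args) {
        Packed21Sketch p = new Packed21Sketch(10);
        p.set(4, 1500000L);  // 21-bit max is 2097151
        p.set(5, 42L);       // shares a block with indices 3 and 4
        System.out.println(p.get(4)); // prints 1500000
        System.out.println(p.get(5)); // prints 42
    }
}
```

Note that get and set each touch a single long, which is what avoids the two-block reads/writes of Packed64 that the description mentions.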


[jira] [Commented] (LUCENE-2504) sorting performance regression

2012-05-23 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281794#comment-13281794
 ] 

Simon Willnauer commented on LUCENE-2504:
-

yonik, I see a bunch of commits on this issue, can we resolve this?


 sorting performance regression
 --

 Key: LUCENE-2504
 URL: https://issues.apache.org/jira/browse/LUCENE-2504
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/search
Affects Versions: 4.0
Reporter: Yonik Seeley
 Fix For: 4.0

 Attachments: LUCENE-2504.patch, LUCENE-2504.patch, LUCENE-2504.patch, 
 LUCENE-2504.zip, LUCENE-2504_SortMissingLast.patch


 sorting can be much slower on trunk than branch_3x

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira






[jira] [Commented] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen

2012-05-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281797#comment-13281797
 ] 

Michael McCandless commented on LUCENE-4062:


bq. Should we add a flag to the data stream in order to know what 
implementation serialized the integers?

I think so?

 More fine-grained control over the packed integer implementation that is 
 chosen
 ---

 Key: LUCENE-4062
 URL: https://issues.apache.org/jira/browse/LUCENE-4062
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/other
Reporter: Adrien Grand
Assignee: Michael McCandless
Priority: Minor
  Labels: performance
 Fix For: 4.1

 Attachments: LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, 
 LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch


 In order to save space, Lucene has two main PackedInts.Mutable implementations, 
 one that is very fast and is based on a byte/short/integer/long array 
 (Direct*) and another one which packs bits in a memory-efficient manner 
 (Packed*).
 The packed implementation tends to be much slower than the direct one, which 
 discourages some Lucene components from using it. On the other hand, if you 
 store 21-bit integers in a Direct32, this is a space loss of (32-21)/32≈34%.
 If you are willing to trade some space for speed, you could store 3 of these 
 21-bit integers in a long, resulting in an overhead of 1/3 bit per value. One 
 advantage of this approach is that you never need to read more than one block 
 to read or write a value, so this can be significantly faster than Packed32 
 and Packed64, which always need to read/write two blocks in order to avoid 
 costly branches.
 I ran some tests, and for 1000 21-bit values, this implementation takes 
 less than 2% more space and has 44% faster writes and 30% faster reads. The 
 12-bit version (5 values per block) has the same performance improvement and 
 a 6% memory overhead compared to the packed implementation.
 In order to select the best implementation for a given integer size, I wrote 
 the {{PackedInts.getMutable(valueCount, bitsPerValue, 
 acceptableOverheadPerValue)}} method. This method selects the fastest 
 implementation that has less than {{acceptableOverheadPerValue}} wasted bits 
 per value. For example, if you accept an overhead of 20% 
 ({{acceptableOverheadPerValue = 0.2f * bitsPerValue}}), which is pretty 
 reasonable, here are the implementations that would be selected:
  * 1: Packed64SingleBlock1
  * 2: Packed64SingleBlock2
  * 3: Packed64SingleBlock3
  * 4: Packed64SingleBlock4
  * 5: Packed64SingleBlock5
  * 6: Packed64SingleBlock6
  * 7: Direct8
  * 8: Direct8
  * 9: Packed64SingleBlock9
  * 10: Packed64SingleBlock10
  * 11: Packed64SingleBlock12
  * 12: Packed64SingleBlock12
  * 13: Packed64
  * 14: Direct16
  * 15: Direct16
  * 16: Direct16
  * 17: Packed64
  * 18: Packed64SingleBlock21
  * 19: Packed64SingleBlock21
  * 20: Packed64SingleBlock21
  * 21: Packed64SingleBlock21
  * 22: Packed64
  * 23: Packed64
  * 24: Packed64
  * 25: Packed64
  * 26: Packed64
  * 27: Direct32
  * 28: Direct32
  * 29: Direct32
  * 30: Direct32
  * 31: Direct32
  * 32: Direct32
  * 33: Packed64
  * 34: Packed64
  * 35: Packed64
  * 36: Packed64
  * 37: Packed64
  * 38: Packed64
  * 39: Packed64
  * 40: Packed64
  * 41: Packed64
  * 42: Packed64
  * 43: Packed64
  * 44: Packed64
  * 45: Packed64
  * 46: Packed64
  * 47: Packed64
  * 48: Packed64
  * 49: Packed64
  * 50: Packed64
  * 51: Packed64
  * 52: Packed64
  * 53: Packed64
  * 54: Direct64
  * 55: Direct64
  * 56: Direct64
  * 57: Direct64
  * 58: Direct64
  * 59: Direct64
  * 60: Direct64
  * 61: Direct64
  * 62: Direct64
 Under 32 bits per value, only 13, 17 and 22-26 bits per value would still 
 choose the slower Packed64 implementation. Allowing a 50% overhead would 
 prevent the packed implementation from being selected for bits per value under 
 32. Allowing an overhead of 32 bits per value would make sure that a Direct* 
 implementation is always selected.
 Next steps would be to:
  * make Lucene components use this {{getMutable}} method and let users decide 
 what trade-off better suits them,
  * write a Packed32SingleBlock implementation if necessary (I didn't do it 
 because I have no 32-bit computer to test the performance improvements).
 I think this would allow more fine-grained control over the speed/space 
 trade-off, what do you think?
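The 20%-overhead selection rule quoted above can be checked with some quick arithmetic. The sketch below is not Lucene's getMutable code, just the wasted-bits computation the description implies; the class and helper names are made up.

```java
// Back-of-the-envelope check of the selection rule: an implementation is
// eligible if its wasted bits per value stay under acceptableOverheadPerValue.
public class OverheadSketch {

    // Wasted bits per value for a Direct* implementation (8/16/32/64-bit slots).
    static float directOverhead(int bitsPerValue) {
        int slot = bitsPerValue <= 8 ? 8 : bitsPerValue <= 16 ? 16
                 : bitsPerValue <= 32 ? 32 : 64;
        return slot - bitsPerValue;
    }

    // Wasted bits per value for a Packed64SingleBlock-style implementation,
    // which fits floor(64 / bitsPerValue) values into each 64-bit block.
    static float singleBlockOverhead(int bitsPerValue) {
        int valuesPerBlock = 64 / bitsPerValue;
        return (64 - valuesPerBlock * bitsPerValue) / (float) valuesPerBlock;
    }

    public static void main(String[] args) {
        int bpv = 21;
        float acceptable = 0.2f * bpv;                              // 4.2 wasted bits allowed
        System.out.println(directOverhead(bpv) <= acceptable);      // false: Direct32 wastes 11 bits
        System.out.println(singleBlockOverhead(bpv) <= acceptable); // true: ~0.33 bits wasted
    }
}
```

For bitsPerValue = 21 this rejects Direct32 and accepts the single-block layout, matching the table in the description (21 maps to Packed64SingleBlock21).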





[jira] [Commented] (LUCENE-4018) Make accessible subenums in MappingMultiDocsEnum

2012-05-23 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281798#comment-13281798
 ] 

Simon Willnauer commented on LUCENE-4018:
-

This makes sense to me; any objections?

 Make accessible subenums in MappingMultiDocsEnum
 

 Key: LUCENE-4018
 URL: https://issues.apache.org/jira/browse/LUCENE-4018
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/codecs
Affects Versions: 4.0
Reporter: Renaud Delbru
  Labels: codec, flex, merge
 Fix For: 4.0

 Attachments: LUCENE-4018.patch


 The #merge method of the PostingsConsumer receives MappingMultiDocsEnum and 
 MappingMultiDocsAndPositionsEnum as postings enums. In certain cases (with 
 specific postings formats), the #merge method needs to be overridden, and 
 the underlying DocsEnums wrapped by the MappingMultiDocsEnum need to be 
 accessed.
 The MappingMultiDocsEnum class should provide a method #getSubs similarly to 
 MultiDocsEnum class.







[jira] [Commented] (LUCENE-4018) Make accessible subenums in MappingMultiDocsEnum

2012-05-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281804#comment-13281804
 ] 

Michael McCandless commented on LUCENE-4018:


+1

 Make accessible subenums in MappingMultiDocsEnum
 

 Key: LUCENE-4018
 URL: https://issues.apache.org/jira/browse/LUCENE-4018
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/codecs
Affects Versions: 4.0
Reporter: Renaud Delbru
  Labels: codec, flex, merge
 Fix For: 4.0

 Attachments: LUCENE-4018.patch


 The #merge method of the PostingsConsumer receives MappingMultiDocsEnum and 
 MappingMultiDocsAndPositionsEnum as postings enum. In certain cases (with 
 specific postings formats), the #merge method needs to be overwritten, and 
 the underlying DocsEnums wrapped by the MappingMultiDocsEnum need to be 
 accessed.
 The MappingMultiDocsEnum class should provide a method #getSubs similarly to 
 MultiDocsEnum class.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen

2012-05-23 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281805#comment-13281805
 ] 

Adrien Grand commented on LUCENE-4062:
--

Isn't it a problem to break compatibility? Or should we use special (> 64) 
values of bitsPerValue so that current trunk indexes will still work after the 
patch is applied?

 More fine-grained control over the packed integer implementation that is 
 chosen
 ---

 Key: LUCENE-4062
 URL: https://issues.apache.org/jira/browse/LUCENE-4062
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/other
Reporter: Adrien Grand
Assignee: Michael McCandless
Priority: Minor
  Labels: performance
 Fix For: 4.1

 Attachments: LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, 
 LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch


 In order to save space, Lucene has two main PackedInts.Mutable implementations, 
 one that is very fast and is based on a byte/short/integer/long array 
 (Direct*) and another one which packs bits in a memory-efficient manner 
 (Packed*).
 The packed implementation tends to be much slower than the direct one, which 
 discourages some Lucene components to use it. On the other hand, if you store 
 21 bits integers in a Direct32, this is a space loss of (32-21)/32=35%.
 If you accept to trade some space for speed, you could store 3 of these 21 
 bits integers in a long, resulting in an overhead of 1/3 bit per value. One 
 advantage of this approach is that you never need to read more than one block 
 to read or write a value, so this can be significantly faster than Packed32 
 and Packed64 which always need to read/write two blocks in order to avoid 
 costly branches.
 I ran some tests, and for 1000 21 bits values, this implementation takes 
 less than 2% more space and has 44% faster writes and 30% faster reads. The 
 12 bits version (5 values per block) has the same performance improvement and 
 a 6% memory overhead compared to the packed implementation.
 In order to select the best implementation for a given integer size, I wrote 
 the {{PackedInts.getMutable(valueCount, bitsPerValue, 
 acceptableOverheadPerValue)}} method. This method selects the fastest 
 implementation that has less than {{acceptableOverheadPerValue}} wasted bits 
 per value. For example, if you accept an overhead of 20% 
 ({{acceptableOverheadPerValue = 0.2f * bitsPerValue}}), which is pretty 
 reasonable, here is what implementations would be selected:
  * 1: Packed64SingleBlock1
  * 2: Packed64SingleBlock2
  * 3: Packed64SingleBlock3
  * 4: Packed64SingleBlock4
  * 5: Packed64SingleBlock5
  * 6: Packed64SingleBlock6
  * 7: Direct8
  * 8: Direct8
  * 9: Packed64SingleBlock9
  * 10: Packed64SingleBlock10
  * 11: Packed64SingleBlock12
  * 12: Packed64SingleBlock12
  * 13: Packed64
  * 14: Direct16
  * 15: Direct16
  * 16: Direct16
  * 17: Packed64
  * 18: Packed64SingleBlock21
  * 19: Packed64SingleBlock21
  * 20: Packed64SingleBlock21
  * 21: Packed64SingleBlock21
  * 22: Packed64
  * 23: Packed64
  * 24: Packed64
  * 25: Packed64
  * 26: Packed64
  * 27: Direct32
  * 28: Direct32
  * 29: Direct32
  * 30: Direct32
  * 31: Direct32
  * 32: Direct32
  * 33: Packed64
  * 34: Packed64
  * 35: Packed64
  * 36: Packed64
  * 37: Packed64
  * 38: Packed64
  * 39: Packed64
  * 40: Packed64
  * 41: Packed64
  * 42: Packed64
  * 43: Packed64
  * 44: Packed64
  * 45: Packed64
  * 46: Packed64
  * 47: Packed64
  * 48: Packed64
  * 49: Packed64
  * 50: Packed64
  * 51: Packed64
  * 52: Packed64
  * 53: Packed64
  * 54: Direct64
  * 55: Direct64
  * 56: Direct64
  * 57: Direct64
  * 58: Direct64
  * 59: Direct64
  * 60: Direct64
  * 61: Direct64
  * 62: Direct64
 Under 32 bits per value, only 13, 17 and 22-26 bits per value would still 
 choose the slower Packed64 implementation. Allowing a 50% overhead would 
 prevent the packed implementation from being selected for bits per value under 32. 
 Allowing an overhead of 32 bits per value would make sure that a Direct* 
 implementation is always selected.
 Next steps would be to:
  * make lucene components use this {{getMutable}} method and let users decide 
 what trade-off better suits them,
  * write a Packed32SingleBlock implementation if necessary (I didn't do it 
 because I have no 32-bits computer to test the performance improvements).
 I think this would allow more fine-grained control over the speed/space 
 trade-off, what do you think?
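As a sketch of the single-block layout described above, three 21-bit values share one long, so each get/set touches exactly one block. This is a hypothetical illustration class, not Lucene's actual Packed64SingleBlock implementation:

```java
public class SingleBlock21 {
    private static final int BITS = 21;
    private static final int VALUES_PER_BLOCK = 3;      // floor(64 / 21)
    private static final long MASK = (1L << BITS) - 1;  // low 21 bits set

    private final long[] blocks;

    public SingleBlock21(int valueCount) {
        blocks = new long[(valueCount + VALUES_PER_BLOCK - 1) / VALUES_PER_BLOCK];
    }

    // Each value lives entirely inside one long, so a read never crosses
    // a block boundary -- this is what avoids the two-block reads of
    // Packed32/Packed64.
    public long get(int index) {
        int block = index / VALUES_PER_BLOCK;
        int shift = (index % VALUES_PER_BLOCK) * BITS;
        return (blocks[block] >>> shift) & MASK;
    }

    public void set(int index, long value) {
        int block = index / VALUES_PER_BLOCK;
        int shift = (index % VALUES_PER_BLOCK) * BITS;
        blocks[block] = (blocks[block] & ~(MASK << shift))
                      | ((value & MASK) << shift);
    }
}
```

The single unused bit per 64-bit block is the 1/3-bit-per-value overhead the proposal mentions.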

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Assigned] (LUCENE-4018) Make accessible subenums in MappingMultiDocsEnum

2012-05-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer reassigned LUCENE-4018:
---

Assignee: Simon Willnauer

 Make accessible subenums in MappingMultiDocsEnum
 

 Key: LUCENE-4018
 URL: https://issues.apache.org/jira/browse/LUCENE-4018
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/codecs
Affects Versions: 4.0
Reporter: Renaud Delbru
Assignee: Simon Willnauer
  Labels: codec, flex, merge
 Fix For: 4.0

 Attachments: LUCENE-4018.patch


 The #merge method of the PostingsConsumer receives MappingMultiDocsEnum and 
 MappingMultiDocsAndPositionsEnum as postings enum. In certain cases (with 
 specific postings formats), the #merge method needs to be overwritten, and 
 the underlying DocsEnums wrapped by the MappingMultiDocsEnum need to be 
 accessed.
 The MappingMultiDocsEnum class should provide a method #getSubs similarly to 
 MultiDocsEnum class.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-4018) Make accessible subenums in MappingMultiDocsEnum

2012-05-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-4018.
-

Resolution: Fixed

committed to trunk!

thanks Renaud

 Make accessible subenums in MappingMultiDocsEnum
 

 Key: LUCENE-4018
 URL: https://issues.apache.org/jira/browse/LUCENE-4018
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/codecs
Affects Versions: 4.0
Reporter: Renaud Delbru
Assignee: Simon Willnauer
  Labels: codec, flex, merge
 Fix For: 4.0

 Attachments: LUCENE-4018.patch


 The #merge method of the PostingsConsumer receives MappingMultiDocsEnum and 
 MappingMultiDocsAndPositionsEnum as postings enum. In certain cases (with 
 specific postings formats), the #merge method needs to be overwritten, and 
 the underlying DocsEnums wrapped by the MappingMultiDocsEnum need to be 
 accessed.
 The MappingMultiDocsEnum class should provide a method #getSubs similarly to 
 MultiDocsEnum class.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen

2012-05-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281814#comment-13281814
 ] 

Michael McCandless commented on LUCENE-4062:


bq. Isn't it a problem to break compatibility? 

It isn't.

3x indices never store packed ints ... so we are only breaking doc values in 
4.0, and we are allowed (for only a bit more time!) to break 4.0's index 
format.  So we should just break it and not pollute 4.0's sources with false 
back compat code...

Separately, if somehow we did need to preserve back compat for packed ints file 
format... we should use the version in the codec header to accomplish that (ie, 
we don't have to stuff version information inside the bitsPerValue).

 More fine-grained control over the packed integer implementation that is 
 chosen
 ---

 Key: LUCENE-4062
 URL: https://issues.apache.org/jira/browse/LUCENE-4062
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/other
Reporter: Adrien Grand
Assignee: Michael McCandless
Priority: Minor
  Labels: performance
 Fix For: 4.1

 Attachments: LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, 
 LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch


 In order to save space, Lucene has two main PackedInts.Mutable implementations, 
 one that is very fast and is based on a byte/short/integer/long array 
 (Direct*) and another one which packs bits in a memory-efficient manner 
 (Packed*).
 The packed implementation tends to be much slower than the direct one, which 
 discourages some Lucene components to use it. On the other hand, if you store 
 21 bits integers in a Direct32, this is a space loss of (32-21)/32=35%.
 If you accept to trade some space for speed, you could store 3 of these 21 
 bits integers in a long, resulting in an overhead of 1/3 bit per value. One 
 advantage of this approach is that you never need to read more than one block 
 to read or write a value, so this can be significantly faster than Packed32 
 and Packed64 which always need to read/write two blocks in order to avoid 
 costly branches.
 I ran some tests, and for 1000 21 bits values, this implementation takes 
 less than 2% more space and has 44% faster writes and 30% faster reads. The 
 12 bits version (5 values per block) has the same performance improvement and 
 a 6% memory overhead compared to the packed implementation.
 In order to select the best implementation for a given integer size, I wrote 
 the {{PackedInts.getMutable(valueCount, bitsPerValue, 
 acceptableOverheadPerValue)}} method. This method selects the fastest 
 implementation that has less than {{acceptableOverheadPerValue}} wasted bits 
 per value. For example, if you accept an overhead of 20% 
 ({{acceptableOverheadPerValue = 0.2f * bitsPerValue}}), which is pretty 
 reasonable, here is what implementations would be selected:
  * 1: Packed64SingleBlock1
  * 2: Packed64SingleBlock2
  * 3: Packed64SingleBlock3
  * 4: Packed64SingleBlock4
  * 5: Packed64SingleBlock5
  * 6: Packed64SingleBlock6
  * 7: Direct8
  * 8: Direct8
  * 9: Packed64SingleBlock9
  * 10: Packed64SingleBlock10
  * 11: Packed64SingleBlock12
  * 12: Packed64SingleBlock12
  * 13: Packed64
  * 14: Direct16
  * 15: Direct16
  * 16: Direct16
  * 17: Packed64
  * 18: Packed64SingleBlock21
  * 19: Packed64SingleBlock21
  * 20: Packed64SingleBlock21
  * 21: Packed64SingleBlock21
  * 22: Packed64
  * 23: Packed64
  * 24: Packed64
  * 25: Packed64
  * 26: Packed64
  * 27: Direct32
  * 28: Direct32
  * 29: Direct32
  * 30: Direct32
  * 31: Direct32
  * 32: Direct32
  * 33: Packed64
  * 34: Packed64
  * 35: Packed64
  * 36: Packed64
  * 37: Packed64
  * 38: Packed64
  * 39: Packed64
  * 40: Packed64
  * 41: Packed64
  * 42: Packed64
  * 43: Packed64
  * 44: Packed64
  * 45: Packed64
  * 46: Packed64
  * 47: Packed64
  * 48: Packed64
  * 49: Packed64
  * 50: Packed64
  * 51: Packed64
  * 52: Packed64
  * 53: Packed64
  * 54: Direct64
  * 55: Direct64
  * 56: Direct64
  * 57: Direct64
  * 58: Direct64
  * 59: Direct64
  * 60: Direct64
  * 61: Direct64
  * 62: Direct64
 Under 32 bits per value, only 13, 17 and 22-26 bits per value would still 
 choose the slower Packed64 implementation. Allowing a 50% overhead would 
 prevent the packed implementation from being selected for bits per value under 32. 
 Allowing an overhead of 32 bits per value would make sure that a Direct* 
 implementation is always selected.
 Next steps would be to:
  * make lucene components use this {{getMutable}} method and let users decide 
 what trade-off better suits them,
  * write a Packed32SingleBlock implementation if necessary (I didn't do it 
 because I have no 32-bits computer to test the performance improvements).

[jira] [Commented] (SOLR-2161) BasicDistributedZkTest.testDistribSearch test failure

2012-05-23 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281815#comment-13281815
 ] 

Dawid Weiss commented on SOLR-2161:
---

Thanks Yonik!

 BasicDistributedZkTest.testDistribSearch test failure
 -

 Key: SOLR-2161
 URL: https://issues.apache.org/jira/browse/SOLR-2161
 Project: Solr
  Issue Type: Bug
  Components: Build
Affects Versions: 4.0
 Environment: Hudson
Reporter: Robert Muir
 Fix For: 4.0


 BasicDistributedZkTest.testDistribSearch failed in Hudson.
 Here is the stacktrace:
 {noformat}
 [junit] Testsuite: org.apache.solr.cloud.BasicDistributedZkTest
 [junit] Testcase: 
 testDistribSearch(org.apache.solr.cloud.BasicDistributedZkTest):
 Caused an ERROR
 [junit] Error executing query
 [junit] org.apache.solr.client.solrj.SolrServerException: Error executing 
 query
 [junit]   at 
 org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
 [junit]   at 
 org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:119)
 [junit]   at 
 org.apache.solr.BaseDistributedSearchTestCase.queryServer(BaseDistributedSearchTestCase.java:290)
 [junit]   at 
 org.apache.solr.cloud.BasicDistributedZkTest.queryServer(BasicDistributedZkTest.java:256)
 [junit]   at 
 org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:305)
 [junit]   at 
 org.apache.solr.cloud.BasicDistributedZkTest.doTest(BasicDistributedZkTest.java:227)
 [junit]   at 
 org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:562)
 [junit]   at 
 org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:795)
 [junit]   at 
 org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:768)
 [junit] Caused by: org.apache.solr.common.SolrException: 
 org.apache.solr.client.solrj.SolrServerException: 
 org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 
 failed to respond  org.apache.solr.common.SolrException: 
 org.apache.solr.client.solrj.SolrServerException: 
 org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 
 failed to respond   at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:318)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1325)at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:337)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
  at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
   at 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388) 
 at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)   
   at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765) 
 at 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) 
 at org.mortbay.jetty.Server.handle(Server.java:326) at 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)  
 at 
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923)
   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547)  at 
 org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at 
 org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at 
 org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)  
at 
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) 
 Caused by: org.apache.solr.client.solrj.SolrServerException: 
 org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 
 failed to respond at 
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:483)
   at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.reque
 [junit] 
 [junit] org.apache.solr.client.solrj.SolrServerException: 
 org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 
 failed to respond  org.apache.solr.common.SolrException: 
 org.apache.solr.client.solrj.SolrServerException: 
 org.apache.commons.httpclient.NoHttpResponseException: The server 127.0.0.1 
 failed to respondat 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:318)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1325)at 
 

[jira] [Commented] (LUCENE-4074) FST Sorter BufferSize causes int overflow if BufferSize > 2048MB

2012-05-23 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281833#comment-13281833
 ] 

Simon Willnauer commented on LUCENE-4074:
-

I will commit this soon if nobody objects.

 FST Sorter BufferSize causes int overflow if BufferSize > 2048MB
 

 Key: LUCENE-4074
 URL: https://issues.apache.org/jira/browse/LUCENE-4074
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/spellchecker
Affects Versions: 3.6, 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.0, 3.6.1

 Attachments: LUCENE-4074.patch


 the BufferSize constructor accepts size in MB as an integer and uses 
 multiplication to convert to bytes. While it's checking the size in bytes to 
 be less than 2048 MB, it does that after byte conversion. If you pass a value 
 > 2047 to the ctor the value overflows since all constants and methods based 
 on MB expect 32 bit signed ints. This does not even result in an exception 
 until the BufferSize is actually passed to the sorter.
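This class of overflow is easy to reproduce in isolation. The standalone sketch below (not Lucene's actual BufferSize code) shows the wrap at 2048 MB and the fix of widening to long before multiplying:

```java
public class MbOverflowDemo {
    public static void main(String[] args) {
        int mb = 2048;

        // int arithmetic: 2048 * 1024 * 1024 == 2^31, which wraps to a
        // negative value -- the silent overflow described in the report.
        int bytesInt = mb * 1024 * 1024;

        // Widening to long *before* multiplying keeps the full value.
        long bytesLong = (long) mb * 1024 * 1024;

        System.out.println(bytesInt);   // negative (wrapped)
        System.out.println(bytesLong);  // 2147483648
    }
}
```

The check must be done on the MB value (or in long arithmetic) before conversion, not on the already-wrapped byte count.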

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4074) FST Sorter BufferSize causes int overflow if BufferSize > 2048MB

2012-05-23 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281835#comment-13281835
 ] 

Robert Muir commented on LUCENE-4074:
-

+1, nice catch

 FST Sorter BufferSize causes int overflow if BufferSize > 2048MB
 

 Key: LUCENE-4074
 URL: https://issues.apache.org/jira/browse/LUCENE-4074
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/spellchecker
Affects Versions: 3.6, 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.0, 3.6.1

 Attachments: LUCENE-4074.patch


 the BufferSize constructor accepts size in MB as an integer and uses 
 multiplication to convert to bytes. While it's checking the size in bytes to 
 be less than 2048 MB, it does that after byte conversion. If you pass a value 
 > 2047 to the ctor the value overflows since all constants and methods based 
 on MB expect 32 bit signed ints. This does not even result in an exception 
 until the BufferSize is actually passed to the sorter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Change Similarity in Solr MoreLikeThis

2012-05-23 Thread Tommaso Teofili
Hi Emmanuel,
Sure, go for it!
Cheers,
Tommaso

2012/5/23 Emmanuel Espina espinaemman...@gmail.com

 LUCENE-896 added support for changing the Similarity class of More
 Like This but this functionality has not been exposed to Solr. I would
 like to create a jira and submit a patch for this.

 Do you agree?

 Thanks
 Emmanuel

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




[jira] [Resolved] (LUCENE-4074) FST Sorter BufferSize causes int overflow if BufferSize > 2048MB

2012-05-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-4074.
-

Resolution: Fixed

committed to trunk and 3.6 branch

 FST Sorter BufferSize causes int overflow if BufferSize > 2048MB
 

 Key: LUCENE-4074
 URL: https://issues.apache.org/jira/browse/LUCENE-4074
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/spellchecker
Affects Versions: 3.6, 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.0, 3.6.1

 Attachments: LUCENE-4074.patch


 the BufferSize constructor accepts size in MB as an integer and uses 
 multiplication to convert to bytes. While it's checking the size in bytes to 
 be less than 2048 MB, it does that after byte conversion. If you pass a value 
 > 2047 to the ctor the value overflows since all constants and methods based 
 on MB expect 32 bit signed ints. This does not even result in an exception 
 until the BufferSize is actually passed to the sorter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-2025) Ability to turn off the store for an index

2012-05-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-2025:


Fix Version/s: (was: 4.0)
   4.1

moving this over to 4.1; this won't happen in 4.0 anymore

 Ability to turn off the store for an index
 --

 Key: LUCENE-2025
 URL: https://issues.apache.org/jira/browse/LUCENE-2025
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/index
Reporter: Michael Busch
Assignee: Michael Busch
Priority: Minor
  Labels: gsoc2011, gsoc2012, lucene-gsoc-11, lucene-gsoc-12, 
 mentor
 Fix For: 4.1


 It would be really good in combination with parallel indexing if the
 Lucene store could be turned off entirely for an index. 
 The reason is that part of the store is the FieldIndex (.fdx file),
 which contains an 8 bytes pointer for each document in a segment, even
 if a document does not contain any stored fields.
 With parallel indexing we will want to rewrite certain parallel
 indexes to update them, and if such an update affects only a small
 number of documents it will be a waste if you have to write the .fdx
 file every time.
 So in the case where you only want to update a data structure in the
 inverted index it makes sense to separate your index into multiple
 parallel indexes, where the ones you want to update don't contain any
 stored fields.
 It'd be also great to not only allow turning off the store but to make
 it customizable, similarly to what flexible indexing wants to achieve
 regarding the inverted index.
 As a start I'd be happy with the ability to simply turn off the store and to
 add more flexibility later.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-1823) QueryParser with new features for Lucene 3

2012-05-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-1823:


Fix Version/s: (was: 4.0)
   4.1

moving this over to 4.1; it seems dead to me though

 QueryParser with new features for Lucene 3
 --

 Key: LUCENE-1823
 URL: https://issues.apache.org/jira/browse/LUCENE-1823
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/queryparser
Reporter: Michael Busch
Assignee: Luis Alves
Priority: Minor
 Fix For: 4.1

 Attachments: lucene_1823_any_opaque_precedence_fuzzybug_v2.patch, 
 lucene_1823_foo_bug_08_26_2009.patch


 I'd like to have a new QueryParser implementation in Lucene 3.1, ideally 
 based on the new QP framework in contrib. It should share as much code as 
 possible with the current StandardQueryParser implementation for easy 
 maintainability.
 Wish list (feel free to extend):
 1. *Operator precedence*: Support operator precedence for boolean operators
 2. *Opaque terms*: Ability to plugin an external parser for certain syntax 
 extensions, e.g. XML query terms
 3. *Improved RangeQuery syntax*: Use more intuitive <=, >=, = instead of [] 
 and {}
 4. *Support for trierange queries*: See LUCENE-1768
 5. *Complex phrases*: See LUCENE-1486
 6. *ANY operator*: E.g. (a b c d) ANY 3 should match if 3 of the 4 terms 
 occur in the same document
 7. *New syntax for Span queries*: I think the surround parser supports this?
 8. *Escaped wildcards*: See LUCENE-588

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2443) Don't assume IntsRef offset is 0 after postings bulk read

2012-05-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281852#comment-13281852
 ] 

Michael McCandless commented on LUCENE-2443:


Yeah, definitely invalid: no more bulk postings API!

 Don't assume IntsRef offset is 0 after postings bulk read
 -

 Key: LUCENE-2443
 URL: https://issues.apache.org/jira/browse/LUCENE-2443
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0


 Yonik found 2 places where we assume the ints start at offset=0 after bulk 
 read -- we can't do this because in general a codec can give us a slice into 
 private int[] arrays, eg int block codec.
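A minimal stand-in (not Lucene's actual IntsRef) illustrates the pitfall: a codec may hand back a slice into a shared int[] whose offset is nonzero, so reads must start at offset, not 0:

```java
public class IntSlice {
    public final int[] ints;
    public final int offset;
    public final int length;

    public IntSlice(int[] ints, int offset, int length) {
        this.ints = ints;
        this.offset = offset;
        this.length = length;
    }

    // Wrong: assumes the slice starts at index 0 of the backing array.
    public int firstAssumingZeroOffset() {
        return ints[0];
    }

    // Right: honor the offset the codec gave us.
    public int first() {
        return ints[offset];
    }

    // Correct iteration pattern: offset .. offset + length.
    public long sum() {
        long s = 0;
        for (int i = offset; i < offset + length; i++) {
            s += ints[i];
        }
        return s;
    }
}
```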

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-2443) Don't assume IntsRef offset is 0 after postings bulk read

2012-05-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-2443.
-

Resolution: Invalid

we don't have a bulk api anymore... invalid

 Don't assume IntsRef offset is 0 after postings bulk read
 -

 Key: LUCENE-2443
 URL: https://issues.apache.org/jira/browse/LUCENE-2443
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0


 Yonik found 2 places where we assume the ints start at offset=0 after bulk 
 read -- we can't do this because in general a codec can give us a slice into 
 private int[] arrays, eg int block codec.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3197) Allow firstSearcher and newSearcher listeners to run in multiple threads

2012-05-23 Thread Tommaso Teofili (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281866#comment-13281866
 ] 

Tommaso Teofili commented on SOLR-3197:
---

I think, as Mark was saying, we could expose an option to define whether warmups 
run on one or more threads, defaulting to 1.
This wouldn't change current behavior, and people could explicitly set the 
no. of threads for warmup.

 Allow firstSearcher and newSearcher listeners to run in multiple threads
 

 Key: SOLR-3197
 URL: https://issues.apache.org/jira/browse/SOLR-3197
 Project: Solr
  Issue Type: Improvement
Reporter: Lance Norskog

 SolrCore submits all listeners (firstSearcher and newSearcher) to a java 
 ExecutorService, but uses a single-threaded one. 
 line 965 in the trunk: 
 {code}
 SolrCore.java around line 965: final ExecutorService searcherExecutor = 
 Executors.newSingleThreadExecutor(); 
 {code}
 line 1280 in the trunk: 
 SolrCore.java around line 1280 runs the first and new 
 searchers, all with the searcherExecutor object created at line 965. 
 Would it work if we changed this ExecutorService to a thread pool version? 
 This seems like it should work:
 {code}
 java.util.concurrent.Executors.newFixedThreadPool(int nThreads, ThreadFactory 
 threadFactory);
 {code}
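The change Lance proposes can be sketched with plain java.util.concurrent, independent of the Solr codebase; the helper name `newSearcherExecutor` and the `warmThreads` parameter are hypothetical, not actual SolrCore API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class WarmupExecutorSketch {
    // Hypothetical helper: keep today's single-threaded behavior for
    // warmThreads <= 1, otherwise use a fixed-size pool so firstSearcher
    // and newSearcher listeners can warm in parallel.
    static ExecutorService newSearcherExecutor(int warmThreads) {
        return warmThreads <= 1
                ? Executors.newSingleThreadExecutor()
                : Executors.newFixedThreadPool(warmThreads);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService searcherExecutor = newSearcherExecutor(4);
        List<Future<String>> results = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            final int listener = i;
            // Stand-in for submitting a firstSearcher/newSearcher listener.
            results.add(searcherExecutor.submit(() -> "warmed listener " + listener));
        }
        for (Future<String> f : results) {
            System.out.println(f.get());
        }
        searcherExecutor.shutdown();
    }
}
```

With warmThreads=1 this is identical to the current newSingleThreadExecutor() behavior, so defaulting to 1 would not change existing installs.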

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-2436) FilterIndexReader doesn't delegate everything necessary

2012-05-23 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-2436.
---

Resolution: Fixed

This was fixed by the IndexReader to AtomicReader refactoring a while ago.

 FilterIndexReader doesn't delegate everything necessary
 ---

 Key: LUCENE-2436
 URL: https://issues.apache.org/jira/browse/LUCENE-2436
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Yonik Seeley
 Fix For: 4.0


 Some new methods like fields() aren't delegated by FilterIndexReader, 
 incorrectly resulting in the IndexReader base class method being used.  We 
 should audit all current IndexReader methods to determine which should be 
 overridden and delegated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-3483) Ability to change Similarity in MoreLikeThisComponent

2012-05-23 Thread Emmanuel Espina (JIRA)
Emmanuel Espina created SOLR-3483:
-

 Summary: Ability to change Similarity in MoreLikeThisComponent
 Key: SOLR-3483
 URL: https://issues.apache.org/jira/browse/SOLR-3483
 Project: Solr
  Issue Type: New Feature
  Components: MoreLikeThis
Reporter: Emmanuel Espina
Priority: Minor
 Fix For: 4.0


LUCENE-896 added support for changing the Similarity class of More
Like This in Lucene but this functionality has not been exposed to Solr. 
This issue aims to extend the MoreLikeThisComponent to support this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2308) Separately specify a field's type

2012-05-23 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281892#comment-13281892
 ] 

Simon Willnauer commented on LUCENE-2308:
-

Can we close this issue? It seems like everything except Yonik's last comment 
has been resolved.

 Separately specify a field's type
 -

 Key: LUCENE-2308
 URL: https://issues.apache.org/jira/browse/LUCENE-2308
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
  Labels: gsoc2011, lucene-gsoc-11, mentor
 Fix For: 4.0

 Attachments: LUCENE-2308-10.patch, LUCENE-2308-11.patch, 
 LUCENE-2308-12.patch, LUCENE-2308-13.patch, LUCENE-2308-14.patch, 
 LUCENE-2308-15.patch, LUCENE-2308-16.patch, LUCENE-2308-17.patch, 
 LUCENE-2308-18.patch, LUCENE-2308-19.patch, LUCENE-2308-2.patch, 
 LUCENE-2308-20.patch, LUCENE-2308-21.patch, LUCENE-2308-3.patch, 
 LUCENE-2308-4.patch, LUCENE-2308-5.patch, LUCENE-2308-6.patch, 
 LUCENE-2308-7.patch, LUCENE-2308-8.patch, LUCENE-2308-9.patch, 
 LUCENE-2308-FT-interface.patch, LUCENE-2308-FT-interface.patch, 
 LUCENE-2308-FT-interface.patch, LUCENE-2308-FT-interface.patch, 
 LUCENE-2308-branch.patch, LUCENE-2308-final.patch, LUCENE-2308-ltc.patch, 
 LUCENE-2308-merge-1.patch, LUCENE-2308-merge-2.patch, 
 LUCENE-2308-merge-3.patch, LUCENE-2308.branchdiffs, 
 LUCENE-2308.branchdiffs.moved, LUCENE-2308.patch, LUCENE-2308.patch, 
 LUCENE-2308.patch, LUCENE-2308.patch, LUCENE-2308.patch


 This came up from discussions on IRC.  I'm summarizing here...
 Today when you make a Field to add to a document you can set things like: 
 indexed or not, stored or not, analyzed or not, details like omitTfAP,
 omitNorms, index term vectors (separately controlling
 offsets/positions), etc.
 I think we should factor these out into a new class (FieldType?).
 Then you could re-use this FieldType instance across multiple fields.
 The Field instance would still hold the actual value.
 We could then do per-field analyzers by adding a setAnalyzer on the
 FieldType, instead of the separate PerFieldAnalyzerWrapper (likewise
 for per-field codecs (with flex), where we now have
 PerFieldCodecWrapper).
 This would NOT be a schema!  It's just refactoring what we already
 specify today.  E.g. it's not serialized into the index.
 This has been discussed before, and I know Michael Busch opened a more
 ambitious (I think?) issue.  I think this is a good first baby step.  We could
 consider a hierarchy of FieldType (NumericFieldType, etc.) but maybe hold
 off on that for starters...
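The refactoring described above can be illustrated with a minimal, self-contained sketch; these FieldType/Field classes are illustrative stand-ins, not Lucene's actual API:

```java
public class FieldTypeSketch {
    // A FieldType holds the per-field index-time flags, so one instance can be
    // shared across many Field values instead of re-specifying the flags.
    static final class FieldType {
        final boolean indexed, stored, omitNorms;
        FieldType(boolean indexed, boolean stored, boolean omitNorms) {
            this.indexed = indexed;
            this.stored = stored;
            this.omitNorms = omitNorms;
        }
    }

    // The Field keeps only the name and value; type info lives in FieldType.
    static final class Field {
        final String name, value;
        final FieldType type;
        Field(String name, String value, FieldType type) {
            this.name = name;
            this.value = value;
            this.type = type;
        }
    }

    public static void main(String[] args) {
        FieldType keyword = new FieldType(true, true, true);
        Field a = new Field("id", "doc-1", keyword);
        Field b = new Field("id", "doc-2", keyword);
        System.out.println(a.type == b.type); // one shared FieldType instance
    }
}
```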

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-4075) Crazy checkout paths break TestXPathEntityProcessor

2012-05-23 Thread Greg Bowyer (JIRA)
Greg Bowyer created LUCENE-4075:
---

 Summary: Crazy checkout paths break TestXPathEntityProcessor
 Key: LUCENE-4075
 URL: https://issues.apache.org/jira/browse/LUCENE-4075
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Greg Bowyer


Same as a bug I raised for javadoc generation: my build.xml is the same as 
upstream, but my checkout path looks like this:
/home/buildserver/workspace/builds/{search-engineering}solr-lucene{trunk}

This means that the prepare-webpages target gets its paths in the buildpaths 
variable as a pipe-separated list, like so:

/home/buildserver/workspace/builds/{search-engineering}solr-lucene{trunk}/lucene/analysis/common/build.xml|/home/buildserver/workspace/builds/{search-engineering}solr-lucene{trunk}/lucene/analysis/icu/build.xml|...(and
 so on)

Attached is a patch that makes TestXPathEntityProcessor use a URL rather than 
the filesystem path, which makes XPath/XML processing happier with unusual path names.
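The fix described above (handing the XML tooling a URL instead of a raw filesystem path) can be sketched with plain JDK calls; `toSystemId` is a hypothetical helper name, not the patch's actual code:

```java
import java.io.File;

public class PathToUrlSketch {
    // Convert a filesystem path to a file: URL. File.toURI() percent-encodes
    // characters that are illegal in URIs (such as '{' and '}'), which is what
    // keeps XPath/XML processing happy with unusual checkout paths.
    static String toSystemId(String fsPath) {
        return new File(fsPath).toURI().toASCIIString();
    }

    public static void main(String[] args) {
        String weird = "/tmp/{search-engineering}solr-lucene{trunk}/data-config.xml";
        System.out.println(toSystemId(weird)); // braces come out percent-encoded
    }
}
```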

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-4075) Crazy checkout paths break TestXPathEntityProcessor

2012-05-23 Thread Greg Bowyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Bowyer updated LUCENE-4075:


Attachment: LUCENE-4075-TestXPathEntityProcessor-WierdPath-Fix.patch

 Crazy checkout paths break TestXPathEntityProcessor
 ---

 Key: LUCENE-4075
 URL: https://issues.apache.org/jira/browse/LUCENE-4075
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Greg Bowyer
 Attachments: LUCENE-4075-TestXPathEntityProcessor-WierdPath-Fix.patch


 Same as a bug I raised for javadoc generation: my build.xml is the same as 
 upstream, but my checkout path looks like this:
 /home/buildserver/workspace/builds/{search-engineering}solr-lucene{trunk}
 This means that the prepare-webpages target gets its paths in the buildpaths 
 variable as a pipe-separated list, like so:
 /home/buildserver/workspace/builds/{search-engineering}solr-lucene{trunk}/lucene/analysis/common/build.xml|/home/buildserver/workspace/builds/{search-engineering}solr-lucene{trunk}/lucene/analysis/icu/build.xml|...(and
  so on)
 Attached is a patch that makes TestXPathEntityProcessor use a URL rather than 
 the filesystem path, which makes XPath/XML processing happier with unusual path names.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: wiki software

2012-05-23 Thread Grant Ingersoll

On May 19, 2012, at 5:55 PM, Ryan McKinley wrote:

 A *long* time ago we discussed converting to Confluence to replace the
 Forrest site.  The key issue was that only committers could have access
 if we wanted to include the generated PDF in the distribution.  This is
 all moot now that we have ditched Forrest.
 
 Since then the discussion has come up again, and I think everyone is in favor
 of the idea, but no one has taken the steps to make it happen.  I
 suggest we:
 
 1. Create infra JIRA issue to:
 * delete the old https://cwiki.apache.org/SOLRxSITE/  test
 * create https://cwiki.apache.org/SOLR
 * create https://cwiki.apache.org/LUCENE

+1


 
 2. Convert existing sites using
 https://studio.plugins.atlassian.com/wiki/display/UWC/UWC+MoinMoin+Notes
 I don't know if this is something we can do, or we can make an infra
 JIRA issue for

I'd actually argue we skip this, kind of.  I'd like to see us have a left-hand 
nav that represents the versions; then we copy the docs into each version 
and go through and make sure everything jibes per version.  While this is 
more work up front, I think in the long run it will result in a much better 
experience for our users.

 
 3. replace existing MoinMoin sites with links to cwiki
 https://wiki.apache.org/jakarta-lucene/
 https://wiki.apache.org/solr/
 
 
 ryan
 
 
 
 
 
 On Sat, May 19, 2012 at 12:48 PM, Mark Miller markrmil...@gmail.com wrote:
 I know there was a long debate about wiki software and docs and what not. It 
 got long enough that I petered out on it.
 
 In some ways, I guess this is a lazy plea for someone that did follow along 
 to summarize. Did we get anywhere? Is there an action item to start on?
 
 I'm in the same spot I was when I started that thread - the first bite I'm 
 after is switching from the dated moin moin to the modern confluence. It 
 seems as easy as opening a JIRA issue to get a confluence space up.
 
 Should we just do that and start migrating, and take further leaps from 
 there?
 
 Or is there some fallout from the previous debate that should be 
 incorporated into the next move?
 
 - Mark Miller
 lucidimagination.com
 
 
 
 
 
 
 
 
 
 
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org
 


Grant Ingersoll
http://www.lucidimagination.com





[jira] [Commented] (LUCENE-3440) FastVectorHighlighter: IDF-weighted terms for ordered fragments

2012-05-23 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281897#comment-13281897
 ] 

Simon Willnauer commented on LUCENE-3440:
-

Koji, do you want to get this in at some point? Now is likely a good time, since 
4.0 is getting close. We won't apply this to 3.6.1, since that would be a 
bugfix-only release, if it happens at all.

 FastVectorHighlighter: IDF-weighted terms for ordered fragments 
 

 Key: LUCENE-3440
 URL: https://issues.apache.org/jira/browse/LUCENE-3440
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/highlighter
Reporter: sebastian L.
Priority: Minor
  Labels: FastVectorHighlighter
 Fix For: 4.0

 Attachments: LUCENE-3440.patch, LUCENE-3440.patch, 
 LUCENE-3440_3.6.1-SNAPSHOT.patch, LUCENE-4.0-SNAPSHOT-3440-9.patch, 
 weight-vs-boost_table01.html, weight-vs-boost_table02.html


 The FastVectorHighlighter assigns every term found in a fragment an equal 
 weight, which ranks fragments with a high number of words (or, in the worst 
 case, a high number of very common words) higher than fragments that contain 
 *all* of the terms used in the original query. 
 This patch provides ordered fragments with IDF-weighted terms: 
 total weight = total weight + IDF for unique term per fragment * boost of 
 query; 
 The ranking formula should be the same as, or at least similar to, the one used 
 in org.apache.lucene.search.highlight.QueryTermScorer.
 The patch is simple, but it works for us. 
 Some ideas:
 - A better approach would be moving the whole fragment scoring into a 
 separate class.
 - Switch scoring via a parameter. 
 - Exact phrases should be given an even better score, regardless of whether a 
 phrase query was executed or not.
 - The edismax/dismax parameters pf, ps, and pf^boost should be observed, and 
 corresponding fragments should be ranked higher. 
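The weighting formula in the description ("total weight += IDF per unique term per fragment * boost of query") can be sketched in plain Java. This is an illustrative stand-alone computation, not the patch's actual classes, and it assumes the classic 1 + ln(N/(df+1)) IDF form:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

public class FragmentScoreSketch {
    // totalWeight += idf(term) * queryBoost, counting each term once per
    // fragment, so repeating a common word no longer inflates the score.
    static double score(List<String> fragmentTerms,
                        Map<String, Integer> docFreq, int numDocs, float boost) {
        double total = 0.0;
        for (String term : new HashSet<>(fragmentTerms)) {
            int df = docFreq.getOrDefault(term, 0);
            double idf = 1.0 + Math.log((double) numDocs / (df + 1));
            total += idf * boost;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> df = new HashMap<>();
        df.put("lucene", 5);  // rare term
        df.put("the", 990);   // very common term
        double rare = score(Arrays.asList("lucene"), df, 1000, 1.0f);
        double common = score(Arrays.asList("the", "the", "the"), df, 1000, 1.0f);
        // The fragment matching the rare query term now outranks the one that
        // merely repeats a common word.
        System.out.println(rare > common); // prints true
    }
}
```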

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3440) FastVectorHighlighter: IDF-weighted terms for ordered fragments

2012-05-23 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-3440:


Fix Version/s: (was: 3.6.1)

Removing 3.6.1 from the fix versions - 3.6.1 is a bugfix-only release.

 FastVectorHighlighter: IDF-weighted terms for ordered fragments 
 

 Key: LUCENE-3440
 URL: https://issues.apache.org/jira/browse/LUCENE-3440
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/highlighter
Reporter: sebastian L.
Priority: Minor
  Labels: FastVectorHighlighter
 Fix For: 4.0

 Attachments: LUCENE-3440.patch, LUCENE-3440.patch, 
 LUCENE-3440_3.6.1-SNAPSHOT.patch, LUCENE-4.0-SNAPSHOT-3440-9.patch, 
 weight-vs-boost_table01.html, weight-vs-boost_table02.html


 The FastVectorHighlighter assigns every term found in a fragment an equal 
 weight, which ranks fragments with a high number of words (or, in the worst 
 case, a high number of very common words) higher than fragments that contain 
 *all* of the terms used in the original query. 
 This patch provides ordered fragments with IDF-weighted terms: 
 total weight = total weight + IDF for unique term per fragment * boost of 
 query; 
 The ranking formula should be the same as, or at least similar to, the one used 
 in org.apache.lucene.search.highlight.QueryTermScorer.
 The patch is simple, but it works for us. 
 Some ideas:
 - A better approach would be moving the whole fragment scoring into a 
 separate class.
 - Switch scoring via a parameter. 
 - Exact phrases should be given an even better score, regardless of whether a 
 phrase query was executed or not.
 - The edismax/dismax parameters pf, ps, and pf^boost should be observed, and 
 corresponding fragments should be ranked higher. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #173

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/173/changes

Changes:

[simonw] LUCENE-4074: FST Sorter BufferSize causes int overflow if BufferSize > 
2048MB

--
[...truncated 10387 lines...]
   [junit4]   2 27640 T3358 oasc.RequestHandlers.initHandlersFromConfig 
created spellCheckCompRH: org.apache.solr.handler.component.SearchHandler
   [junit4]   2 27640 T3358 oasc.RequestHandlers.initHandlersFromConfig 
created spellCheckCompRH_Direct: org.apache.solr.handler.component.SearchHandler
   [junit4]   2 27640 T3358 oasc.RequestHandlers.initHandlersFromConfig 
created spellCheckCompRH1: org.apache.solr.handler.component.SearchHandler
   [junit4]   2 27641 T3358 oasc.RequestHandlers.initHandlersFromConfig 
created tvrh: org.apache.solr.handler.component.SearchHandler
   [junit4]   2 27641 T3358 oasc.RequestHandlers.initHandlersFromConfig 
created /mlt: solr.MoreLikeThisHandler
   [junit4]   2 27641 T3358 oasc.RequestHandlers.initHandlersFromConfig 
created /debug/dump: solr.DumpRequestHandler
   [junit4]   2 27642 T3358 oashl.XMLLoader.init xsltCacheLifetimeSeconds=60
   [junit4]   2 27645 T3358 oasc.SolrCore.initDeprecatedSupport WARNING 
solrconfig.xml uses deprecated admin/gettableFiles, Please update your config 
to use the ShowFileRequestHandler.
   [junit4]   2 27646 T3358 oasc.SolrCore.initDeprecatedSupport WARNING adding 
ShowFileRequestHandler with hidden files: [SOLRCONFIG-HIGHLIGHT.XML, 
SCHEMA-REQUIRED-FIELDS.XML, SCHEMA-REPLICATION2.XML, SCHEMA-MINIMAL.XML, 
BAD-SCHEMA-DUP-DYNAMICFIELD.XML, SOLRCONFIG-CACHING.XML, 
SOLRCONFIG-REPEATER.XML, CURRENCY.XML, BAD-SCHEMA-NONTEXT-ANALYZER.XML, 
SOLRCONFIG-MERGEPOLICY.XML, SOLRCONFIG-TLOG.XML, SOLRCONFIG-MASTER.XML, 
SCHEMA11.XML, SOLRCONFIG-BASIC.XML, DA_COMPOUNDDICTIONARY.TXT, 
SCHEMA-COPYFIELD-TEST.XML, SOLRCONFIG-SLAVE.XML, ELEVATE.XML, 
SOLRCONFIG-PROPINJECT-INDEXDEFAULT.XML, SCHEMA-IB.XML, 
SOLRCONFIG-QUERYSENDER.XML, SCHEMA-REPLICATION1.XML, DA_UTF8.XML, 
HYPHENATION.DTD, SOLRCONFIG-ENABLEPLUGIN.XML, STEMDICT.TXT, 
SCHEMA-PHRASESUGGEST.XML, HUNSPELL-TEST.AFF, STOPTYPES-1.TXT, 
STOPWORDSWRONGENCODING.TXT, SCHEMA-NUMERIC.XML, SOLRCONFIG-TRANSFORMERS.XML, 
SOLRCONFIG-PROPINJECT.XML, BAD-SCHEMA-NOT-INDEXED-BUT-TF.XML, 
SOLRCONFIG-SIMPLELOCK.XML, WDFTYPES.TXT, STOPTYPES-2.TXT, SCHEMA-REVERSED.XML, 
SOLRCONFIG-SPELLCHECKCOMPONENT.XML, SCHEMA-DFR.XML, 
SOLRCONFIG-PHRASESUGGEST.XML, BAD-SCHEMA-NOT-INDEXED-BUT-POS.XML, KEEP-1.TXT, 
OPEN-EXCHANGE-RATES.JSON, STOPWITHBOM.TXT, SCHEMA-BINARYFIELD.XML, 
SOLRCONFIG-SPELLCHECKER.XML, SOLRCONFIG-UPDATE-PROCESSOR-CHAINS.XML, 
BAD-SCHEMA-OMIT-TF-BUT-NOT-POS.XML, BAD-SCHEMA-DUP-FIELDTYPE.XML, 
SOLRCONFIG-MASTER1.XML, SYNONYMS.TXT, SCHEMA.XML, SCHEMA_CODEC.XML, 
SOLRCONFIG-SOLR-749.XML, SOLRCONFIG-MASTER1-KEEPONEBACKUP.XML, STOP-2.TXT, 
SOLRCONFIG-FUNCTIONQUERY.XML, SCHEMA-LMDIRICHLET.XML, SOLRCONFIG-TERMINDEX.XML, 
SOLRCONFIG-ELEVATE.XML, STOPWORDS.TXT, SCHEMA-FOLDING.XML, 
SCHEMA-STOP-KEEP.XML, BAD-SCHEMA-NOT-INDEXED-BUT-NORMS.XML, 
SOLRCONFIG-SOLCOREPROPERTIES.XML, STOP-1.TXT, SOLRCONFIG-MASTER2.XML, 
SCHEMA-SPELLCHECKER.XML, SOLRCONFIG-LAZYWRITER.XML, 
SCHEMA-LUCENEMATCHVERSION.XML, BAD-MP-SOLRCONFIG.XML, FRENCHARTICLES.TXT, 
SCHEMA15.XML, SOLRCONFIG-REQHANDLER.INCL, SCHEMASURROUND.XML, 
SCHEMA-COLLATEFILTER.XML, SOLRCONFIG-MASTER3.XML, HUNSPELL-TEST.DIC, 
SOLRCONFIG-XINCLUDE.XML, SOLRCONFIG-DELPOLICY1.XML, SOLRCONFIG-SLAVE1.XML, 
SCHEMA-SIM.XML, SCHEMA-COLLATE.XML, STOP-SNOWBALL.TXT, PROTWORDS.TXT, 
SCHEMA-TRIE.XML, SOLRCONFIG_CODEC.XML, SCHEMA-TFIDF.XML, 
SCHEMA-LMJELINEKMERCER.XML, PHRASESUGGEST.TXT, 
SOLRCONFIG-BASIC-LUCENEVERSION31.XML, OLD_SYNONYMS.TXT, 
SOLRCONFIG-DELPOLICY2.XML, XSLT, SOLRCONFIG-NATIVELOCK.XML, 
BAD-SCHEMA-DUP-FIELD.XML, SOLRCONFIG-NOCACHE.XML, SCHEMA-BM25.XML, 
SOLRCONFIG-ALTDIRECTORY.XML, SOLRCONFIG-QUERYSENDER-NOQUERY.XML, 
COMPOUNDDICTIONARY.TXT, SOLRCONFIG_PERF.XML, 
SCHEMA-NOT-REQUIRED-UNIQUE-KEY.XML, KEEP-2.TXT, SCHEMA12.XML, 
MAPPING-ISOLATIN1ACCENT.TXT, BAD_SOLRCONFIG.XML, 
BAD-SCHEMA-EXTERNAL-FILEFIELD.XML]
   [junit4]   2 27651 T3358 oass.SolrIndexSearcher.init Opening 
Searcher@b8c765f main
   [junit4]   2 27651 T3358 oass.SolrIndexSearcher.init WARNING WARNING: 
Directory impl does not support setting indexDir: 
org.apache.lucene.store.MockDirectoryWrapper
   [junit4]   2 27651 T3358 oasu.CommitTracker.init Hard AutoCommit: disabled
   [junit4]   2 27654 T3358 oasu.CommitTracker.init Soft AutoCommit: disabled
   [junit4]   2 27654 T3358 oashc.SpellCheckComponent.inform Initializing 
spell checkers
   [junit4]   2 27675 T3358 oass.DirectSolrSpellChecker.init init: 
{name=direct,classname=DirectSolrSpellChecker,field=lowerfilt,minQueryLength=3}
   [junit4]   2 27750 T3358 oashc.HttpShardHandlerFactory.getParameter Setting 
socketTimeout to: 0
   [junit4]   2 27750 T3358 oashc.HttpShardHandlerFactory.getParameter Setting 
urlScheme to: http://
   [junit4]   2 27750 T3358 

[jira] [Commented] (LUCENE-4006) system requirements is duplicated across versioned/unversioned

2012-05-23 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281904#comment-13281904
 ] 

Uwe Schindler commented on LUCENE-4006:
---

Robert: The forrested one is now gone as we used a chainsaw, right? So I think 
we can close this issue :-)

 system requirements is duplicated across versioned/unversioned
 --

 Key: LUCENE-4006
 URL: https://issues.apache.org/jira/browse/LUCENE-4006
 Project: Lucene - Java
  Issue Type: Task
  Components: general/javadocs
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.0


 Our System requirements page is located here on the unversioned site: 
 http://lucene.apache.org/core/systemreqs.html
 But its also in forrest under each release. Can we just nuke the forrested 
 one?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4006) system requirements is duplicated across versioned/unversioned

2012-05-23 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13281906#comment-13281906
 ] 

Uwe Schindler commented on LUCENE-4006:
---

Or maybe better, move the system requirements per release...? That's why this 
issue is still open.

 system requirements is duplicated across versioned/unversioned
 --

 Key: LUCENE-4006
 URL: https://issues.apache.org/jira/browse/LUCENE-4006
 Project: Lucene - Java
  Issue Type: Task
  Components: general/javadocs
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.0


 Our System requirements page is located here on the unversioned site: 
 http://lucene.apache.org/core/systemreqs.html
 But its also in forrest under each release. Can we just nuke the forrested 
 one?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java7-64 #102

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/102/changes

Changes:

[simonw] LUCENE-4074: FST Sorter BufferSize causes int overflow if BufferSize > 
2048MB

[simonw] LUCENE-4018: Make MappingMultiDocsEnum subenums accessible

--
[...truncated 11726 lines...]
   [junit4] Completed in 1.32s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.LeaderElectionTest
   [junit4] Completed in 26.17s, 4 tests
   [junit4]  
   [junit4] Suite: 
org.apache.solr.search.similarities.TestDefaultSimilarityFactory
   [junit4] Completed in 0.16s, 1 test
   [junit4]  
   [junit4] Suite: 
org.apache.solr.analysis.TestNorwegianMinimalStemFilterFactory
   [junit4] Completed in 0.01s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.OverseerTest
   [junit4] Completed in 63.07s, 7 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.NodeStateWatcherTest
   [junit4] Completed in 23.86s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.BasicDistributedZkTest
   [junit4] Completed in 59.09s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.analysis.TestPhoneticFilterFactory
   [junit4] Completed in 8.99s, 5 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.TestDistributedGrouping
   [junit4] Completed in 26.93s, 1 test
   [junit4]  
   [junit4] Suite: 
org.apache.solr.handler.component.DistributedSpellCheckComponentTest
   [junit4] Completed in 21.67s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.BasicZkTest
   [junit4] Completed in 11.86s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.TestJoin
   [junit4] Completed in 13.22s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.ZkControllerTest
   [junit4] Completed in 17.67s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.TestGroupingSearch
   [junit4] Completed in 7.16s, 12 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.function.TestFunctionQuery
   [junit4] Completed in 4.15s, 14 tests
   [junit4]  
   [junit4] Suite: 
org.apache.solr.handler.component.DistributedQueryElevationComponentTest
   [junit4] Completed in 5.42s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterFSTTest
   [junit4] Completed in 1.49s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.SolrCoreTest
   [junit4] Completed in 5.20s, 5 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.BadIndexSchemaTest
   [junit4] Completed in 1.23s, 6 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.StandardRequestHandlerTest
   [junit4] Completed in 0.98s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.request.TestWriterPerf
   [junit4] Completed in 1.26s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestPseudoReturnFields
   [junit4] Completed in 1.41s, 13 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestSurroundQueryParser
   [junit4] Completed in 0.96s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.search.function.SortByFunctionTest
   [junit4] Completed in 1.88s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.function.distance.DistanceFunctionTest
   [junit4] Completed in 1.04s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.XsltUpdateRequestHandlerTest
   [junit4] Completed in 0.96s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.core.SolrCoreCheckLockOnStartupTest
   [junit4] Completed in 1.50s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestFoldingMultitermQuery
   [junit4] Completed in 1.32s, 18 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.CurrencyFieldTest
   [junit4] IGNORED 0.00s | CurrencyFieldTest.testPerformance
   [junit4] Cause: Annotated @Ignore()
   [junit4] Completed in 1.15s, 8 tests, 1 skipped
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestSolrDeletionPolicy1
   [junit4] IGNOR/A 0.04s | TestSolrDeletionPolicy1.testCommitAge
   [junit4] Assumption #1: This test is not working on Windows (or maybe 
machines with only 2 CPUs)
   [junit4]   2 1005 T3145 oas.SolrTestCaseJ4.setUp ###Starting testCommitAge
   [junit4]   2 1011 T3145 C60 oasu.DirectUpdateHandler2.deleteAll 
[collection1] REMOVING ALL DOCUMENTS FROM INDEX
   [junit4]   2 1012 T3145 C60 UPDATE [collection1] webapp=null path=null 
params={} {deleteByQuery=*:*} 0 1
   [junit4]   2 1014 T3145 oas.SolrTestCaseJ4.tearDown ###Ending testCommitAge
   [junit4]   2
   [junit4] Completed in 1.25s, 3 tests, 1 skipped
   [junit4]  
   [junit4] Suite: org.apache.solr.update.SolrIndexConfigTest
   [junit4] Completed in 1.66s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.XmlUpdateRequestHandlerTest
   [junit4] Completed in 0.96s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.component.DebugComponentTest
   [junit4] Completed in 1.30s, 2 tests
   [junit4]  
   [junit4] Suite: 

Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java6-64 #174

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/174/changes


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java7-64 #103

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/103/changes


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-1489) highlighter problem with n-gram tokens

2012-05-23 Thread Lance Norskog (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13282040#comment-13282040
 ] 

Lance Norskog commented on LUCENE-1489:
---

Is this still a problem?

 highlighter problem with n-gram tokens
 --

 Key: LUCENE-1489
 URL: https://issues.apache.org/jira/browse/LUCENE-1489
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/highlighter
Reporter: Koji Sekiguchi
Priority: Minor
 Attachments: LUCENE-1489.patch, lucene1489.patch


 I have a problem when using n-gram and highlighter. I thought it had been 
 solved in LUCENE-627...
 Actually, I found this problem when I was using CJKTokenizer on Solr; here is 
 a Lucene program to reproduce it using NGramTokenizer(min=2,max=2) 
 instead of CJKTokenizer:
 {code:java}
 public class TestNGramHighlighter {
   public static void main(String[] args) throws Exception {
     Analyzer analyzer = new NGramAnalyzer();
     final String TEXT = "Lucene can make index. Then Lucene can search.";
     final String QUERY = "can";
     QueryParser parser = new QueryParser("f", analyzer);
     Query query = parser.parse(QUERY);
     QueryScorer scorer = new QueryScorer(query, "f");
     Highlighter h = new Highlighter(scorer);
     System.out.println(h.getBestFragment(analyzer, "f", TEXT));
   }
   static class NGramAnalyzer extends Analyzer {
     public TokenStream tokenStream(String field, Reader input) {
       return new NGramTokenizer(input, 2, 2);
     }
   }
 }
 {code}
 expected output is:
 Lucene <B>can</B> make index. Then Lucene <B>can</B> search.
 but the actual output is:
 Lucene <B>can make index. Then Lucene can</B> search.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira






[jira] [Commented] (SOLR-2694) LogUpdateProcessor not thread safe

2012-05-23 Thread Ethan Tao (JIRA)

[ https://issues.apache.org/jira/browse/SOLR-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282041#comment-13282041 ]

Ethan Tao commented on SOLR-2694:
-

We've decided to manually apply the patch from SOLR-2804 dated 1/1/12 to the 
current snapshot.
If this patch won't become official, there should at least be a new class, 
ConcurrentLogUpdateProcessorFactory, to handle the thread-safety issue. We'll 
file a new bug for it.
Thanks.

 LogUpdateProcessor not thread safe
 --

 Key: SOLR-2694
 URL: https://issues.apache.org/jira/browse/SOLR-2694
 Project: Solr
  Issue Type: Bug
  Components: update
Affects Versions: 1.4.1, 3.1, 3.2, 3.3, 4.0
Reporter: Jan Høydahl

 Using the LogUpdateProcessor while feeding in multiple parallel threads does 
 not work, as LogUpdateProcessor is not thread-safe.







Build failed in Jenkins: Lucene-Solr-trunk-Linux-Java6-64 #500

2012-05-23 Thread jenkins
See 
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux-Java6-64/500/

--
[...truncated 11820 lines...]
   [junit4] Suite: org.apache.solr.update.AutoCommitTest
   [junit4] Completed on J1 in 8.51s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.util.SolrPluginUtilsTest
   [junit4] Completed on J1 in 0.52s, 7 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.FullSolrCloudDistribCmdsTest
   [junit4] Completed on J0 in 21.00s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.ZkSolrClientTest
   [junit4] Completed on J0 in 6.40s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.internal.csv.writer.CSVWriterTest
   [junit4] Completed on J1 in 0.00s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.TestReplicationHandler
   [junit4] Completed on J1 in 22.93s, 1 test
   [junit4]  
   [junit4] Suite: 
org.apache.solr.handler.component.DistributedSpellCheckComponentTest
   [junit4] Completed on J0 in 6.93s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.TestHashPartitioner
   [junit4] Completed on J0 in 5.12s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.cloud.ZkControllerTest
   [junit4] Completed on J1 in 7.10s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.function.TestFunctionQuery
   [junit4] Completed on J0 in 1.98s, 14 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterFSTTest
   [junit4] Completed on J1 in 0.73s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.MoreLikeThisHandlerTest
   [junit4] Completed on J1 in 0.67s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.TestTrie
   [junit4] Completed on J1 in 0.82s, 8 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.BadIndexSchemaTest
   [junit4] Completed on J1 in 0.83s, 6 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestSort
   [junit4] Completed on J0 in 3.92s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestJmxIntegration
   [junit4] IGNORED 0.00s J0 | TestJmxIntegration.testJmxOnCoreReload
   [junit4] Cause: Annotated @Ignore(timing problem? 
https://issues.apache.org/jira/browse/SOLR-2715)
   [junit4] Completed on J0 in 1.09s, 3 tests, 1 skipped
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.component.StatsComponentTest
   [junit4] Completed on J1 in 3.12s, 6 tests
   [junit4]  
   [junit4] Suite: 
org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest
   [junit4] Completed on J1 in 0.78s, 6 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.BasicFunctionalityTest
   [junit4] IGNORED 0.00s J0 | BasicFunctionalityTest.testDeepPaging
   [junit4] Cause: Annotated @Ignore(See SOLR-1726)
   [junit4] Completed on J0 in 1.74s, 23 tests, 1 skipped
   [junit4]  
   [junit4] Suite: org.apache.solr.request.TestWriterPerf
   [junit4] Completed on J0 in 0.87s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestCoreContainer
   [junit4] Completed on J1 in 1.68s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.handler.CSVRequestHandlerTest
   [junit4] Completed on J1 in 0.68s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.highlight.HighlighterTest
   [junit4] Completed on J0 in 1.64s, 27 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.function.distance.DistanceFunctionTest
   [junit4] Completed on J0 in 0.70s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.spelling.SpellCheckCollatorTest
   [junit4] Completed on J1 in 1.32s, 6 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.PolyFieldTest
   [junit4] Completed on J1 in 0.77s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.SpatialFilterTest
   [junit4] Completed on J0 in 1.06s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.servlet.CacheHeaderTest
   [junit4] Completed on J1 in 0.62s, 5 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.TestOmitPositions
   [junit4] Completed on J0 in 0.55s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.core.TestSolrDeletionPolicy1
   [junit4] Completed on J1 in 0.73s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.update.SolrIndexConfigTest
   [junit4] Completed on J0 in 0.87s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.analysis.TestReversedWildcardFilterFactory
   [junit4] Completed on J1 in 0.52s, 4 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestQueryUtils
   [junit4] Completed on J1 in 0.59s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.search.TestIndexSearcher
   [junit4] Completed on J0 in 1.29s, 2 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.DisMaxRequestHandlerTest
   [junit4] Completed on J1 in 0.62s, 3 tests
   [junit4]  
   [junit4] Suite: org.apache.solr.response.TestCSVResponseWriter
   [junit4] Completed on J0 in 0.49s, 1 test
   [junit4]  
   [junit4] Suite: org.apache.solr.schema.IndexSchemaTest
   

[jira] [Created] (SOLR-3484) LogUpdateProcessor throws ConcurrentModificationException under multi-threading calls

2012-05-23 Thread Ethan Tao (JIRA)
Ethan Tao created SOLR-3484:
---

 Summary: LogUpdateProcessor throws ConcurrentModificationException 
under multi-threading calls 
 Key: SOLR-3484
 URL: https://issues.apache.org/jira/browse/SOLR-3484
 Project: Solr
  Issue Type: Bug
  Components: update
Affects Versions: 4.0
 Environment: linux
Reporter: Ethan Tao


Using the LogUpdateProcessor in a singleton chain for concurrent processing 
throws an exception. The issue was reported in SOLR-2694 (closed), and an 
unofficial patch can be found in the related issue SOLR-2804, patch dated 1/1/12.

If the patch won't become official for LogUpdateProcessor, it is suggested to 
add a new class, ConcurrentLogUpdateProcessorFactory, to address the 
thread-safety issue.
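
For illustration only (this is not Solr's LogUpdateProcessor, whose internals
live in the Solr source): a ConcurrentModificationException of this kind is the
classic result of mutating a plain ArrayList while it is being iterated, which
a snapshot-iterating collection such as CopyOnWriteArrayList tolerates. A
minimal sketch, with hypothetical names, of the failure mode and one possible
mitigation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class ConcurrentLogDemo {
    // Iterate the list of logged doc ids while a writer appends to it.
    // Returns true if iteration completes, false if it fails fast with
    // a ConcurrentModificationException.
    static boolean iterateWhileAdding(List<String> ids) {
        ids.add("doc1");
        ids.add("doc2");
        try {
            for (String id : ids) {
                if (id.equals("doc1")) {
                    // simulate a second update thread appending mid-iteration
                    ids.add("doc3");
                }
            }
            return true;
        } catch (java.util.ConcurrentModificationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // ArrayList's iterator fails fast on concurrent structural change
        System.out.println("ArrayList survives: "
                + iterateWhileAdding(new ArrayList<>()));
        // CopyOnWriteArrayList iterates a snapshot, so appends are safe
        System.out.println("CopyOnWriteArrayList survives: "
                + iterateWhileAdding(new CopyOnWriteArrayList<>()));
    }
}
```

Whether a real fix should use a copy-on-write structure or explicit
synchronization depends on the write rate; a log buffer that is appended
frequently and read once per request may be better served by synchronized
access than by copying on every add.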







Re: wiki software

2012-05-23 Thread Jan Høydahl
Agree with Grant that we need a better, more structured documentation set, per 
version.
But let's first do what Ryan suggests - switch to Confluence - put all 
converted MoinMoin pages in a separate legacy section of the Wiki.

Then in phase II, start an effort to do fresh documentation for version 4.0 
only, first outlining the structure and placeholder pages and then filling 
in the meat. A good thing about Confluence is that we could probably use macros 
to link to SVN and Javadoc and in various ways auto-generate parts of the docs.

This way we can migrate Wiki software without being held up by the need to 
rewrite everything, and we do not need to keep updating two systems.

--
Jan Høydahl, search solution architect
Cominvent AS - www.facebook.com/Cominvent
Solr Training - www.solrtraining.com

On 23. mai 2012, at 22:49, Grant Ingersoll wrote:

 
 On May 19, 2012, at 5:55 PM, Ryan McKinley wrote:
 
  A *long* time ago we discussed converting to Confluence to replace the
  Forrest site.  The key issue was that only committers could have access
  if we want to include the generated PDF in the distribution.  This is
  all moot now that we have ditched Forrest.
 
  Since then the discussion has come up and I think everyone is in favor
  of the idea, but no one has taken the steps to make it happen.  I
  suggest we:
 
 1. Create infra JIRA issue to:
 * delete the old https://cwiki.apache.org/SOLRxSITE/  test
 * create https://cwiki.apache.org/SOLR
 * create https://cwiki.apache.org/LUCENE
 
 +1
 
 
 
 2. Convert existing sites using
 https://studio.plugins.atlassian.com/wiki/display/UWC/UWC+MoinMoin+Notes
 I don't know if this is something we can do, or we can make an infra
 JIRA issue for
 
 I'd actually argue we skip this, kind of.  I'd like to see us have a left 
 hand nav that represents the versions and then we copy the docs into each 
 version and then go through and make sure everything jives per version.  
 While this is more work up front, I think in the long run, it will result in 
 a much better experience for our users.
 
 
 3. replace existing MoinMoin sites with links to cwiki
 https://wiki.apache.org/jakarta-lucene/
 https://wiki.apache.org/solr/
 
 
 ryan
 
 
 
 
 
 On Sat, May 19, 2012 at 12:48 PM, Mark Miller markrmil...@gmail.com wrote:
 I know there was a long debate about wiki software and docs and what not. 
 It got long enough that I petered out on it.
 
 In some ways, I guess this is a lazy plea for someone that did follow along 
 to summarize. Did we get anywhere? Is there an action item to start on?
 
 I'm in the same spot I was when I started that thread - the first bite I'm 
 after is switching from the dated moin moin to the modern confluence. It 
 seems as easy as opening a JIRA issue to get a confluence space up.
 
 Should we just do that and start migrating, and take further leaps from 
 there?
 
 Or is there some fallout from the previous debate that should be 
 incorporated into the next move?
 
 - Mark Miller
 lucidimagination.com
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 Grant Ingersoll
 http://www.lucidimagination.com
 
 
 


