Re: solr limits
Hello, please clarify: does "documents" mean unique IDs, or something else? Say I have files indexed and each file number is unique, so the file count would be 2.14 billion. Or say I have content in a database as records, each record with a unique ID, so the record count would be 2.14 billion. Am I right? -- Thanks Regards Sachin Aggarwal 7760502772
Re: Apache Lucene Eurocon 2012
OK. Do you know when and where Lucene Eurocon 2012 is going to happen? On Wed, Jun 20, 2012 at 10:16 PM, Mikhail Khludnev mkhlud...@griddynamics.com wrote: up -- Sincerely yours Mikhail Khludnev Tech Lead Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
Re: solr limits
Hi, one indexed record is one document, along with one unique ID -- just as one row is one record in a database, one document is one "row" in Solr. On Thu, Jun 21, 2012 at 11:39 AM, Sachin Aggarwal different.sac...@gmail.com wrote: [...]
Re: solr limits
Thanks. On Thu, Jun 21, 2012 at 11:51 AM, irshad siddiqui irshad.s...@gmail.com wrote: [...] -- Thanks Regards Sachin Aggarwal 7760502772
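For reference, the 2.14 billion ceiling discussed in this thread is Integer.MAX_VALUE: Lucene numbers documents with a Java int, so a single index tops out at 2,147,483,647 documents, and deleted documents that have not yet been merged away still count against that ceiling. A minimal sketch of the arithmetic (illustration only, not Solr API):

```java
// Illustration only (not Solr API): the per-index document ceiling.
// Lucene addresses documents with a Java int, so one index holds at most
// Integer.MAX_VALUE = 2,147,483,647 documents.
public class DocLimit {
    public static final long MAX_DOCS_PER_INDEX = Integer.MAX_VALUE;

    // Whether a projected corpus fits in one unsharded index.
    public static boolean fitsInOneIndex(long totalDocs) {
        return totalDocs <= MAX_DOCS_PER_INDEX;
    }

    public static void main(String[] args) {
        System.out.println(MAX_DOCS_PER_INDEX);              // 2147483647
        System.out.println(fitsInOneIndex(2_140_000_000L));  // true
        System.out.println(fitsInOneIndex(3_000_000_000L));  // false
    }
}
```

So whether the unit is files or database rows, what counts against the limit is documents: one per unique ID.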
Re: Solr with Tomcat on VPS
Can anyone help? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-Tomcat-on-VPS-tp3990397p3990677.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr with Tomcat on VPS
Hi, you have to check your installation configuration. You have put the solr.war file inside the Tomcat webapps folder; within its WEB-INF folder, check that web.xml has the URL of your Solr core folder. On Thu, Jun 21, 2012 at 3:34 PM, mcfly04 hil...@csc-scc.gc.ca wrote: Can anyone help?
Re: Solr with Tomcat on VPS
Thank you for your response. Are you referring to the SolrRequestFilter path-prefix? Here is a copy of my web.xml:

  <?xml version="1.0" encoding="UTF-8"?>
  <!DOCTYPE web-app PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN" "http://java.sun.com/dtd/web-app_2_3.dtd">
  <web-app>
    <filter>
      <filter-name>SolrRequestFilter</filter-name>
      <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
    </filter>
    <filter-mapping>
      <filter-name>SolrRequestFilter</filter-name>
      <url-pattern>/*</url-pattern>
    </filter-mapping>
    <servlet>
      <servlet-name>SolrServer</servlet-name>
      <display-name>Solr</display-name>
      <description>Solr Server</description>
      <servlet-class>org.apache.solr.servlet.SolrServlet</servlet-class>
      <load-on-startup>1</load-on-startup>
    </servlet>
    <servlet>
      <servlet-name>SolrUpdate</servlet-name>
      <display-name>SolrUpdate</display-name>
      <description>Solr Update Handler</description>
      <servlet-class>org.apache.solr.servlet.SolrUpdateServlet</servlet-class>
      <load-on-startup>2</load-on-startup>
    </servlet>
    <servlet>
      <servlet-name>Logging</servlet-name>
      <servlet-class>org.apache.solr.servlet.LogLevelSelection</servlet-class>
    </servlet>
    <servlet>
      <servlet-name>ping</servlet-name>
      <jsp-file>/admin/ping.jsp</jsp-file>
    </servlet>
    <servlet-mapping>
      <servlet-name>SolrServer</servlet-name>
      <url-pattern>/select/*</url-pattern>
    </servlet-mapping>
    <servlet-mapping>
      <servlet-name>SolrUpdate</servlet-name>
      <url-pattern>/update/*</url-pattern>
    </servlet-mapping>
    <servlet-mapping>
      <servlet-name>Logging</servlet-name>
      <url-pattern>/admin/logging</url-pattern>
    </servlet-mapping>
    <servlet-mapping>
      <servlet-name>ping</servlet-name>
      <url-pattern>/admin/ping</url-pattern>
    </servlet-mapping>
    <servlet-mapping>
      <servlet-name>Logging</servlet-name>
      <url-pattern>/admin/logging.jsp</url-pattern>
    </servlet-mapping>
    <mime-mapping>
      <extension>.xsl</extension>
      <mime-type>application/xslt+xml</mime-type>
    </mime-mapping>
    <welcome-file-list>
      <welcome-file>index.jsp</welcome-file>
      <welcome-file>index.html</welcome-file>
    </welcome-file-list>
  </web-app>

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-Tomcat-on-VPS-tp3990397p3990687.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr with Tomcat on VPS
Hi, in this web.xml file you need to add the lines below:

  <env-entry>
    <env-entry-name>solr/home</env-entry-name>
    <env-entry-value>your solr core path here</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
  </env-entry>

On Thu, Jun 21, 2012 at 4:24 PM, mcfly04 hil...@csc-scc.gc.ca wrote: [...]
Re: Solr with Tomcat on VPS
Hi, you can also refer to the URL below for Solr configuration: http://tek-manthan.blogspot.in/ Regards, Irshad On Thu, Jun 21, 2012 at 4:24 PM, mcfly04 hil...@csc-scc.gc.ca wrote: [...]
Re: solr java.lang.NullPointerException on select queries
Ah, OK, I misunderstood. OK, here are a couple of off-the-top-of-my-head ideas. Make a backup of your index before anything else G...

Split up your current index into two parts by segments. That is, copy the whole directory to another place, and remove some of the segments from each. I.e. when you're done, you'll still have all the segments you used to have, but some of them will be in one directory and some in another. Of course all of the segment files with a common prefix should be in one place (e.g. all the _0.* files in the same dir, not split between the two dirs). Now run CheckIndex on them. That'll take a long time, but it _should_ spoof Solr/Lucene into thinking that there are two complete indexes out there. Now your idea of having an archival search should work, but with two places to look, not one. NOTE: Whether this plays nice with the over-2B docs or deleted documents I can't guarantee. I believe that the deleted docs are tracked per segment; if so, this should be fine. This won't work if you've recently optimized. When you're done you should have two cores out there (hmmm, these could also be treated as shards?) that you point your Solr at. You might want to optimize in this case when you're done. I suspect you could, with a magnetized needle and a steady hand, edit some of the auxiliary files (segments*), but I would feel more secure letting CheckIndex do the heavy lifting.

Here's another possibility:
- Try a delete-by-query from a bit before the date you think things went over 2B to now (I really hope you have a date!). Perhaps you can walk the underlying index in Lucene somehow and make this work if you don't have a date. Since the underlying Lucene IDs are segment_base + local_segment_count, this should be safely under 2B, but I'm reaching here into areas I don't know much about.
- Optimize (and wait, probably a really long time).
- Re-index everything after the date (or whatever) you used above into a new shard.
- Now treat the big index just as you were talking about.

Please understand that the over-2B docs might cause some grief here, but since the underlying index is segment-based (i.e. the internal Lucene doc IDs are a base+offset for each segment), this has a decent chance of working (but anyone who really understands, please chime in; I'm reaching). Oh, and if it works, please let us know... Best Erick

On Wed, Jun 20, 2012 at 6:37 PM, avenka ave...@gmail.com wrote: Erick, thanks for the advice, but let me make sure you haven't misunderstood what I was asking. I am not trying to split the huge existing index in install1 into shards. I am also not trying to make the huge install1 index one shard of a sharded Solr setup. I plan to use a sharded setup only for future docs. I do want to avoid re-indexing the docs in install1, and instead to think of them as a slow tape-archive index server if I ever need to go and query the past documents. So I was wondering if I could somehow use the existing segment files to run an isolated (unsharded) Solr server that lets me query roughly the first 2B docs before the wraparound problem happened. If the negative internal doc IDs have pervasively corrupted the segment files, this would not be possible, but I am not able to imagine an underlying Lucene design that would cause such a problem. Is my only option to re-index the past 2B docs if I want to be able to query them at this point, or is there any way to use the existing segment files? -- View this message in context: http://lucene.472066.n3.nabble.com/solr-java-lang-NullPointerException-on-select-queries-tp3989974p3990615.html Sent from the Solr - User mailing list archive at Nabble.com.
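Erick's point that internal doc IDs are a per-segment base plus a local offset is also why the symptom is negative IDs: the sum is held in a 32-bit int and silently wraps once it passes 2^31 - 1. A standalone sketch of the wraparound (illustration only, not Lucene code):

```java
// Illustration only (not Lucene code): a global doc ID is a per-segment base
// plus a local offset, held in a 32-bit int. Once the sum passes
// Integer.MAX_VALUE the arithmetic silently wraps negative, which is the
// symptom behind the NullPointerExceptions and the negative maxDoc.
public class DocIdWrap {
    public static int globalDocId(int segmentBase, int localOffset) {
        return segmentBase + localOffset; // plain int addition: can overflow
    }

    public static void main(String[] args) {
        // Still under the limit: a valid, positive ID.
        System.out.println(globalDocId(2_000_000_000, 100_000_000)); // 2100000000
        // Past the limit: wraps to a negative "ID".
        System.out.println(globalDocId(2_000_000_000, 200_000_000)); // -2094967296
    }
}
```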
how to import product of entity data with DIH
I need to import data from SQL Server and Cassandra. First, get user IDs from SQL Server; then get one user's characters from Cassandra by user ID; last, save the user's character docs into Solr. One user has multiple characters, and I need to save the docs like this:

1040.txt
  <row uniqueid="10031016578048" passportid="1040" character="Ranea"/>
  <row uniqueid="10031016578049" passportid="1040" character="assinissa"/>
  <row uniqueid="1005101793120" passportid="1040" character="AmmSmashYous"/>
  <row uniqueid="1005101793121" passportid="1040" character="Dangless"/>
  <row uniqueid="1007102768032" passportid="1040" character="sees"/>
  <row uniqueid="10131031905" passportid="1040" character="Loopz"/>
  <row uniqueid="10131031907" passportid="1040" character="MyLongName"/>
  <row uniqueid="10141031680" passportid="1040" character="Rawr"/>
  <row uniqueid="10261043118" passportid="1040" character="Firebald"/>
  <row uniqueid="10191054480" passportid="1040" character="salt"/>

19734880.txt
  <row uniqueid="10091011208112" passportid="19734880" character="3Negreteo"/>
  <row uniqueid="10091011208113" passportid="19734880" character="3BlaKin"/>

Field uniqueid is the character ID, field passportid is the user ID, and field character is the character name. I wrote the DIH config like the following:

  <dataConfig>
    <dataSource name="mssql" driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
                url="jdbc:sqlserver://localhost:1433;databaseName=pw_account;"/>
    <dataSource name="url" type="URLDataSource"/>
    <dataSource name="reader" type="FieldReaderDataSource"/>
    <document>
      <entity name="user" dataSource="mssql" query="SELECT top 100 id FROM pw_account..account">
        <entity name="line" dataSource="url" processor="LineEntityProcessor"
                url="http://localhost/${user.id}.txt" format="text" encoding="UTF-8"
                connectionTimeout="5000" readTimeout="10">
          <entity name="xml" dataSource="reader" processor="XPathEntityProcessor"
                  dataField="line.rawLine" forEach="/row" rootEntity="false">
            <field column="id" name="id" xpath="/row/@uniqueid"/>
            <field column="character" name="character" xpath="/row/@character"/>
            <field column="passportid" name="passportid" xpath="/row/@passportid"/>
          </entity>
        </entity>
      </entity>
    </document>
  </dataConfig>

But it doesn't work. I get the result in CSV format like this:

  id,passportid,character
  1,1040,Ranea
  2,,
  3,,
  4,,

Only user 1040's first character is imported, and the imported IDs are all wrong. How do I write the correct DIH config? Regards -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-import-product-of-entity-date-with-DIH-tp3990694.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr with Tomcat on VPS
Thanks for clarifying! I have the solr/home configured in the server.xml of Tomcat. When I had not set it properly there were errors in the log; it is configured correctly now, as there are no errors in the log regarding solr/home. The issue is that I cannot access any of the servlets. I can access any of the JSPs by name. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-Tomcat-on-VPS-tp3990397p3990712.html Sent from the Solr - User mailing list archive at Nabble.com.
LeaderElection
Hi, Messing with behaviour when Solr loses its ZooKeeper connection, I'm trying to reproduce how a replica slice gets leader. I have made the unit test below in the LeaderElectionTest class, which fails. I don't know if this simulates how Solr uses the LeaderElection class, but please comment on the scenario. Thanks in advance. Best regards, Trym

  @Test
  public void testMemoryElection() throws Exception {
      LeaderElector first = new LeaderElector(zkClient);
      ZkNodeProps props = new ZkNodeProps(ZkStateReader.BASE_URL_PROP,
          "http://127.0.0.1/solr/", ZkStateReader.CORE_NAME_PROP, "1");
      ElectionContext firstContext = new ShardLeaderElectionContextBase(first,
          "slice1", "collection2", "dummynode1", props, zkStateReader);
      first.setup(firstContext);
      first.joinElection(firstContext);
      Thread.sleep(1000);
      assertEquals("original leader was not registered",
          "http://127.0.0.1/solr/1/", getLeaderUrl("collection2", "slice1"));

      SolrZkClient zkClient2 = new SolrZkClient(server.getZkAddress(), TIMEOUT);
      LeaderElector second = new LeaderElector(zkClient2);
      props = new ZkNodeProps(ZkStateReader.BASE_URL_PROP,
          "http://127.0.0.1/solr/", ZkStateReader.CORE_NAME_PROP, "2");
      ElectionContext context = new ShardLeaderElectionContextBase(second,
          "slice1", "collection2", "dummynode1", props, zkStateReader);
      second.setup(context);
      second.joinElection(context);
      Thread.sleep(1000);
      assertEquals("original leader should have stayed leader",
          "http://127.0.0.1/solr/1/", getLeaderUrl(zkClient2, "collection2", "slice1"));

      server.expire(zkClient.getSolrZooKeeper().getSessionId());
      assertEquals("new leader was not registered",
          "http://127.0.0.1/solr/2/", getLeaderUrl(zkClient2, "collection2", "slice1"));
  }
Re: Editing solr update handler sub class
Hmmm. I think you would have a _far_ easier time of this by just getting all the source code, modifying the relevant source, and then issuing an "ant dist" (or "ant example" if you wanted to try it). There are other targets that will package up the whole thing just like you would get it from the website. And consider making a plugin rather than modifying DirectUpdateHandler2. Your custom update handler can _inherit_ from that class and do your special stuff. It's still easier, IMO, if you get the complete source. Executing "ant dist" will put all the files in the dist folder as Irshad says. You need an svn client (although I think there are Git repos out there too), ant, and Ivy (although if you don't have the Ivy stuff, you will be guided through its installation when you try the ant command). See http://wiki.apache.org/solr/HowToContribute for how to get the source and make a build. Best Erick On Thu, Jun 21, 2012 at 1:36 AM, irshad siddiqui irshad.s...@gmail.com wrote: Hi, the jar files are located in the dist folder. Check your dist folder, or you can check your solrconfig.xml file, where you will find the jar location path. On Thu, Jun 21, 2012 at 9:47 AM, Shameema Umer shem...@gmail.com wrote: Can anybody tell me where the Lucene jar files for org.apache.lucene.index and org.apache.lucene.search are located? Thanks Shameema On Wed, Jun 20, 2012 at 4:44 PM, Shameema Umer shem...@gmail.com wrote: Hi, I decompiled DirectUpdateHandler2.class to a .java file and edited it to suit my requirement of not overwriting duplicates (I need the first fetched tstamp). But when I tried to compile it back to a .class file, it showed 91 errors. Did I go wrong anywhere? I am new to Java applications but fluent in web languages. Please help. Thanks Shameema
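The behaviour Shameema is after (keep the first-fetched tstamp, ignore later duplicates) boils down to add-if-absent keyed on the uniqueKey. The sketch below illustrates just that policy with a plain map; it is not Solr API, and a real implementation would live in the kind of custom update handler Erick describes:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only (not Solr API): "don't overwrite duplicates" as
// add-if-absent keyed on the uniqueKey. The first-fetched document's
// tstamp wins; later documents with the same key are skipped.
public class KeepFirstIndex {
    private final Map<String, Long> tstampById = new LinkedHashMap<>();

    // Returns true if the doc was added, false if a duplicate was skipped.
    public boolean add(String uniqueKey, long tstamp) {
        return tstampById.putIfAbsent(uniqueKey, tstamp) == null;
    }

    public Long tstamp(String uniqueKey) {
        return tstampById.get(uniqueKey);
    }
}
```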
Re: write.lock
Hi, we are running exactly the same Solr version and have these issues relatively frequently. The main cause in our case has usually been out-of-memory exceptions, as some of our shards are pretty fat. Allocating more RAM usually helps for a while. The lock file still needs to be manually removed, unfortunately. There are also sometimes commit collisions, and we get "max warming searchers exceeded" exceptions, but we haven't yet figured out whether that may cause the locking as well. -- Dmitry On Wed, Jun 20, 2012 at 7:45 PM, Christopher Gross cogr...@gmail.com wrote: I'm running Solr 3.4. The past 2 months I've been getting a lot of write.lock errors. I switched to the simple lockType (and made it clear the lock on restart), but my index is still locking up a few times a week. I can't seem to determine what is causing the locks -- does anyone out there have any ideas/experience as to what is causing the locks, and what config changes I can make in order to prevent them? Any help would be very appreciated! -- Chris -- Regards, Dmitry Kan
Re: solrj and replication
OK, tested it myself: a slave running embedded works, just not within my application -- yet... On 20.06.2012 18:14, tom wrote: Hi, I was just wondering if I need to do something special to get replication working with an embedded slave? My setup is like so: my clustered application uses embedded Solr(J) (for performance); the cores are configured as slaves that should connect to a master which runs in a Jetty. The embedded cores don't expose any of the Solr servlets. Note: the slave config, if started in Jetty, does proper replication, while when embedded it doesn't. Using Solr 3.5. Thx, tom
Re: parameters to decide solr memory consumption
No, that's 255 bytes/record. Also, any time you store a field, the raw data is preserved in the *.fdt and *.fdx files. If you're thinking about RAM requirements, you must subtract the amount of data in those files from the total, as a start. This might help: http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/fileformats.html Best Erick On Thu, Jun 21, 2012 at 1:48 AM, Sachin Aggarwal different.sac...@gmail.com wrote: Thanks for the help. I tried an exercise: I am storing a schema (uuid, key, userlocation). uuid and key are unique, and userlocation has a cardinality of 150. uuid and key are stored and indexed, while userlocation is indexed but not stored. Still the index directory size is 51 MB for just 200,000 records. Don't you think that's not optimal? What if I go for billions of records? -- Thanks Regards Sachin Aggarwal 7760502772
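The 255 bytes/record figure is simply the observed index size divided by the document count, taking 51 MB as 51,000,000 bytes. A trivial sketch of that back-of-the-envelope estimate:

```java
// Illustration only: back-of-the-envelope bytes-per-document estimate
// from an observed index directory size, as used in the reply above.
public class IndexSizeMath {
    public static long bytesPerDoc(long indexSizeBytes, long docCount) {
        return indexSizeBytes / docCount;
    }

    public static void main(String[] args) {
        System.out.println(bytesPerDoc(51_000_000L, 200_000L)); // 255
    }
}
```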
suggester/autocomplete locks file preventing replication
Hi, I'm using the suggester with a file like so:

  <searchComponent class="solr.SpellCheckComponent" name="suggest">
    <lst name="spellchecker">
      <str name="name">suggest</str>
      <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
      <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup</str>
      <!-- Alternatives to lookupImpl:
           org.apache.solr.spelling.suggest.fst.FSTLookup [finite state automaton]
           org.apache.solr.spelling.suggest.jaspell.JaspellLookup [default, jaspell-based]
           org.apache.solr.spelling.suggest.tst.TSTLookup [ternary trees] -->
      <!-- the indexed field to derive suggestions from -->
      <!-- TODO must change this to spell or smth alike later -->
      <str name="field">content</str>
      <float name="threshold">0.05</float>
      <str name="buildOnCommit">true</str>
      <str name="weightBuckets">100</str>
      <str name="sourceLocation">autocomplete.dictionary</str>
    </lst>
  </searchComponent>

When trying to replicate, I get the following error message on the slave side:

  2012-06-21 14:34:50,781 ERROR [pool-3-thread-1] handler.ReplicationHandler - SnapPull failed
  org.apache.solr.common.SolrException: Unable to rename: path autocomplete.dictionary.20120620120611
      at org.apache.solr.handler.SnapPuller.copyTmpConfFiles2Conf(SnapPuller.java:642)
      at org.apache.solr.handler.SnapPuller.downloadConfFiles(SnapPuller.java:526)
      at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:299)
      at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:268)
      at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
      at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
      at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
      at java.lang.Thread.run(Thread.java:619)

So I dug around and found out that Solr's Java process holds a lock on the autocomplete.dictionary file. Any reason why this is so? Thx. Running: Solr 3.5, Win7.
Re: suggester/autocomplete locks file preventing replication
BTW: a core unload doesn't release the lock either ;( On 21.06.2012 14:39, tom wrote: [...]
Re: suggester/autocomplete locks file preventing replication
Poking into the code, I think the FileDictionary class is the culprit: it takes an InputStream as a ctor argument but never releases the stream. What puzzles me is that the class seems to allow a one-time iteration, and then the stream is useless, unless I'm missing something here. Is there a good reason for this, or is it rather a bug? Should I move the topic to the dev list? On 21.06.2012 14:49, tom wrote: BTW: a core unload doesn't release the lock either ;( On 21.06.2012 14:39, tom wrote: [...]
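The SnapPuller "Unable to rename" failure above is the classic symptom of an open file handle on Windows, consistent with the observation that FileDictionary never closes its InputStream. The standalone sketch below (not Solr code; file names are made up) shows why closing the stream matters: once the reader is released, the dictionary file can be swapped out:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustration only (not Solr code; file names are made up): on Windows an
// open InputStream blocks renaming the file, which matches the SnapPuller
// "Unable to rename" failure. Closing the reader (try-with-resources here)
// releases the handle so the dictionary file can be replaced afterwards.
public class DictionaryReload {
    public static boolean readThenReplace() {
        try {
            Path dict = Files.createTempFile("autocomplete", ".dictionary");
            Files.write(dict, "suggestion\t1.0\n".getBytes("UTF-8"));
            try (BufferedReader r = Files.newBufferedReader(dict)) {
                while (r.readLine() != null) { /* consume the dictionary */ }
            } // reader closed here -- no lingering lock on the file
            Path next = dict.resolveSibling(dict.getFileName() + ".new");
            Files.move(dict, next, StandardCopyOption.REPLACE_EXISTING);
            boolean moved = Files.exists(next);
            Files.deleteIfExists(next);
            return moved;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(readThenReplace()); // true
    }
}
```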
Re: Commit when a segment is written
I don't think autocommit is deprecated; it's just commented out of the config, and using commitWithin (assuming you're working from SolrJ) is preferred if possible. But what governs a particular set of docs? What are the criteria that determine when you want to commit? Flushes and commits are orthogonal. A segment is kept open through multiple flushes; that is, there can be many flushes, and the documents still aren't searchable until the first commit (but it sounds like you're aware of that). Have you tried using autocommit? And what version of Solr are you using? And finally, what is your use case for frequent commits? If you're going after NRT functionality, have you looked at the NRT stuff in 4.x? Best Erick On Thu, Jun 21, 2012 at 8:01 AM, Ramprakash Ramamoorthy youngestachie...@gmail.com wrote: Dear all, I am using Lucene/Solr for my log search tool. Is there a way I can perform a commit operation on my IndexWriter when a particular set of docs is flushed from memory to disk? My ramBufferSize is 24 MB and mergeFactor is 10. Or is calling commit at frequent intervals, irrespective of the flushes, the only way? I wish the autocommit feature were not deprecated. -- With Thanks and Regards, Ramprakash Ramamoorthy, Engineer Trainee, Zoho Corporation. +91 9626975420
Solr 4.0 with Near Real Time and Faceted Search in Replicated topology
Hi all, We're thinking of moving forward with Solr 4.0 and we plan to have a master index server and at least two slaves servers. The Master server will be used primarily for indexing and the queries will be load balanced across to the replicated slave servers. I would like to know if, with the current support for Near Real Time search in 4.0, there's support for Faceted Search. Keeping in mind that the searches will be performed against the Slave servers and not the Master (indexing) server. If it's not supported, will we need to use SolrCloud to gain the benefits of Near Real Time search when performing Faceted Searches? Any insight would be greatly appreciated. Thanks all!
Re: solr java.lang.NullPointerException on select queries
Erick, much thanks for detailing these options. I am currently trying the second one, as that seems a little easier and quicker to me. I successfully deleted documents with IDs after the problem time, which I know to an accuracy of a couple of hours. Now, the stats are: numDocs : 2132454075 maxDoc : -2130733352 The former is nicely below 2^31, but I can't seem to get the latter to decrease and become positive by deleting further. Should I just run an optimize at this point? I have never manually run an optimize and plan to just hit http://machine_name/solr/update?optimize=true Can you confirm this? -- View this message in context: http://lucene.472066.n3.nabble.com/solr-java-lang-NullPointerException-on-select-queries-tp3989974p3990798.html Sent from the Solr - User mailing list archive at Nabble.com.
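The negative maxDoc is consistent with a signed 32-bit counter wrapping around once the total of live plus (not-yet-reclaimed) deleted documents passed 2^31 - 1. A quick sketch of that arithmetic — treating maxDoc as docs-plus-deletions is the usual reading; the exact internal counter is an assumption here:

```python
INT32_MAX = 2**31 - 1

def to_int32(n):
    """Interpret an integer as a wrapped signed 32-bit value."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n >= 2**31 else n

observed_maxdoc = -2130733352
# Undo the wrap to estimate the true count that overflowed:
actual_count = observed_maxdoc + 2**32   # 2164233944, past INT32_MAX
assert to_int32(actual_count) == observed_maxdoc  # wraps back
```

This also explains why further deletes don't make maxDoc positive: deletes don't shrink maxDoc until the deleted documents are actually reclaimed by a merge or optimize.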
RE: Exception using distributed field-collapsing
Does it work in the non-distributed case? Is the field you're grouping on stored? What is the type of the uniqueKey field? Is it stored and indexed? I've had a problem with distributed grouping not working when the uniqueKey field was indexed but not stored. Also, in distributed searches, the uniqueKey is used to retrieve documents from the shards, so if it were, say, a date, that may be causing the issue. Cody -Original Message- From: Bryan Loofbourrow [mailto:bloofbour...@knowledgemosaic.com] Sent: Wednesday, June 20, 2012 1:54 PM To: solr-user@lucene.apache.org Subject: RE: Exception using distributed field-collapsing Hi Bryan, What is the fieldtype of the groupField? You can only group by a field that is of type string, as described in the wiki: http://wiki.apache.org/solr/FieldCollapsing#Request_Parameters When you group by another field type, an HTTP 400 should be returned instead of this error. At least that's what I'd expect. Martijn Martijn, The group-by field is a string. I have been unable to figure out how a date comes into the picture at all, and have basically been wondering if there is some problem in the grouping code that misaligns the field values from different results in the group, so that it is not comparing like with like. Not a strong theory, just the only thing I can think of. -- Bryan
RE: Exception using distributed field-collapsing
Cody, Does it work in the non-distributed case? Yes. Is the field you're grouping on stored? What is the type of the uniqueKey field? Is it stored and indexed? The field I'm grouping on is a string, stored and indexed. The unique key field is also a string, stored and indexed. I've had a problem with distributed not working when the uniqueKey field was indexed but not stored. Was it the same exception I'm seeing? -- Bryan
RE: Exception using distributed field-collapsing
No, I believe it was a different exception, just brainstorming (it was a null reference, IIRC). Does a *:* query with no sorting work? Cody
Re: solr java.lang.NullPointerException on select queries
Right, if you optimize, at the end maxDoc should == numDocs. Usually the deleted-document reclamation is done when segments merge, but that won't happen in this case since the index is becoming static, so a manual optimize is probably indicated. Something like this should also work, either way: http://localhost:8983/solr/update?stream.body=<optimize/> But be prepared to wait for a very long time. I'd copy the index somewhere else first, just for safety's sake. Best Erick On Thu, Jun 21, 2012 at 12:52 PM, avenka ave...@gmail.com wrote: Erick, much thanks for detailing these options. I am currently trying the second one as that seems a little easier and quicker to me. I successfully deleted documents with IDs after the problem time that I do know to an accuracy of a couple of hours. Now, the stats are: numDocs : 2132454075 maxDoc : -2130733352 The former is nicely below 2^31, but I can't seem to get the latter to decrease and become positive by deleting further. Should I just run an optimize at this point? I have never manually run an optimize and plan to just hit http://machine_name/solr/update?optimize=true Can you confirm this?
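One practical note on Erick's URL: when hitting it from a script rather than a browser, the `<optimize/>` XML passed via stream.body needs percent-encoding. A small sketch, using the stock example host and port:

```python
from urllib.parse import urlencode

base = "http://localhost:8983/solr/update"
# stream.body lets the update handler read the command from a query
# parameter; the XML must be URL-encoded to survive the query string.
query = urlencode({"stream.body": "<optimize/>"})
url = f"{base}?{query}"
# url ends in: update?stream.body=%3Coptimize%2F%3E
```

The resulting URL can then be fetched with any HTTP client to trigger the optimize.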
RE: Exception using distributed field-collapsing
Does a *:* query with no sorting work? Well, this is interesting. Leaving q= as it was, but removing the sort, makes the whole thing work. And if you were thinking of asking whether the sort field is a date, the answer is yes: it's an indexed and stored DateField. It's also on the list of fields whose values I am requesting with fl=. So I guess this is likely to be the date that is somehow turning up in the ClassCastException. Great suggestion. Thanks, Cody. Now I'm wondering if anyone familiar with the field-collapsing code can see a possible vector for a bug, given this fleshing out of the bug conditions. -- Bryan
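For anyone trying to reproduce the conditions Bryan narrowed down, the failing request shape was roughly a distributed grouped query combined with a date sort. A sketch of how those parameters combine into a query string — the field names and shard hosts below are hypothetical, not taken from the thread:

```python
from urllib.parse import urlencode

# Hypothetical fields/shards, mirroring the reported bug conditions:
# grouping on a string field works alone; adding a date sort fails.
params = {
    "q": "*:*",
    "group": "true",
    "group.field": "category_s",        # string field, stored + indexed
    "sort": "published_dt asc",         # removing this made it work
    "fl": "id,category_s,published_dt", # date field also requested
    "shards": "host1:8983/solr,host2:8983/solr",
}
query_string = urlencode(params)
```

Isolating the request like this makes it easy to toggle one parameter at a time (sort, fl, shards) and confirm exactly which combination triggers the ClassCastException.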