Re: solr limits

2012-06-21 Thread Sachin Aggarwal
Hello,

Please clarify: does "documents" mean unique IDs or something else?

Let's say I have files indexed and each file number is unique, so the file
count will be 2.14 billion.
Assume I have content in a database as records and each record has a unique
ID, so the record count will be 2.14 billion.

Am I right?



-- 

Thanks & Regards

Sachin Aggarwal
7760502772


Re: Apache Lucene Eurocon 2012

2012-06-21 Thread Mikhail Khludnev
OK. Do you know when and where Lucene Eurocon 2012 is going to happen?

On Wed, Jun 20, 2012 at 10:16 PM, Mikhail Khludnev mkhlud...@griddynamics.com wrote:

 up

 --
 Sincerely yours
 Mikhail Khludnev
 Tech Lead
 Grid Dynamics

 http://www.griddynamics.com
  mkhlud...@griddynamics.com




-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Re: solr limits

2012-06-21 Thread irshad siddiqui
Hi,

One indexed record is one document with one unique ID; like one row in a
database, one document is the unit in Solr.
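
For illustration only (the field names are hypothetical), the XML for
adding one such document to Solr looks like this, with exactly one
uniqueKey value per document:

<add>
  <doc>
    <field name="id">12345</field>        <!-- the uniqueKey field -->
    <field name="content">some text</field>
  </doc>
</add>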





On Thu, Jun 21, 2012 at 11:39 AM, Sachin Aggarwal 
different.sac...@gmail.com wrote:

 [...]



Re: solr limits

2012-06-21 Thread Sachin Aggarwal
Thanks.

On Thu, Jun 21, 2012 at 11:51 AM, irshad siddiqui irshad.s...@gmail.com wrote:

 [...]




-- 

Thanks & Regards

Sachin Aggarwal
7760502772


Re: Solr with Tomcat on VPS

2012-06-21 Thread mcfly04
Can anyone help?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-with-Tomcat-on-VPS-tp3990397p3990677.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr with Tomcat on VPS

2012-06-21 Thread irshad siddiqui
Hi,

You have to check the installation configuration.
You have put the solr.war file inside the Tomcat webapps folder; within its
WEB-INF folder, check the Solr core path configured in web.xml.






On Thu, Jun 21, 2012 at 3:34 PM, mcfly04 hil...@csc-scc.gc.ca wrote:

 Can anyone help?

 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Solr-with-Tomcat-on-VPS-tp3990397p3990677.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr with Tomcat on VPS

2012-06-21 Thread mcfly04
Thank you for your response.

Are you referring to the SolrRequestFilter path-prefix?
Here is a copy of my web.xml:

-

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE web-app PUBLIC "-//Sun Microsystems, Inc.//DTD Web
Application 2.3//EN"
"http://java.sun.com/dtd/web-app_2_3.dtd">


<web-app>

  <filter>
    <filter-name>SolrRequestFilter</filter-name>
    <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
  </filter>

  <filter-mapping>
    <filter-name>SolrRequestFilter</filter-name>
    <url-pattern>/*</url-pattern>
  </filter-mapping>

  <servlet>
    <servlet-name>SolrServer</servlet-name>
    <display-name>Solr</display-name>
    <description>Solr Server</description>
    <servlet-class>org.apache.solr.servlet.SolrServlet</servlet-class>
    <load-on-startup>1</load-on-startup>
  </servlet>

  <servlet>
    <servlet-name>SolrUpdate</servlet-name>
    <display-name>SolrUpdate</display-name>
    <description>Solr Update Handler</description>
    <servlet-class>org.apache.solr.servlet.SolrUpdateServlet</servlet-class>
    <load-on-startup>2</load-on-startup>
  </servlet>

  <servlet>
    <servlet-name>Logging</servlet-name>
    <servlet-class>org.apache.solr.servlet.LogLevelSelection</servlet-class>
  </servlet>

  <servlet>
    <servlet-name>ping</servlet-name>
    <jsp-file>/admin/ping.jsp</jsp-file>
  </servlet>

  <servlet-mapping>
    <servlet-name>SolrServer</servlet-name>
    <url-pattern>/select/*</url-pattern>
  </servlet-mapping>

  <servlet-mapping>
    <servlet-name>SolrUpdate</servlet-name>
    <url-pattern>/update/*</url-pattern>
  </servlet-mapping>

  <servlet-mapping>
    <servlet-name>Logging</servlet-name>
    <url-pattern>/admin/logging</url-pattern>
  </servlet-mapping>

  <servlet-mapping>
    <servlet-name>ping</servlet-name>
    <url-pattern>/admin/ping</url-pattern>
  </servlet-mapping>

  <servlet-mapping>
    <servlet-name>Logging</servlet-name>
    <url-pattern>/admin/logging.jsp</url-pattern>
  </servlet-mapping>

  <mime-mapping>
    <extension>.xsl</extension>
    <mime-type>application/xslt+xml</mime-type>
  </mime-mapping>

  <welcome-file-list>
    <welcome-file>index.jsp</welcome-file>
    <welcome-file>index.html</welcome-file>
  </welcome-file-list>

</web-app>


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-with-Tomcat-on-VPS-tp3990397p3990687.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr with Tomcat on VPS

2012-06-21 Thread irshad siddiqui
Hi,

In this web.xml file you need to add the lines below:

 <env-entry>
    <env-entry-name>solr/home</env-entry-name>
    <env-entry-value>your solr core path here</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
 </env-entry>
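
For example, assuming a hypothetical install path (use wherever your Solr
home actually lives, i.e. the directory containing conf/):

 <env-entry>
    <env-entry-name>solr/home</env-entry-name>
    <env-entry-value>/opt/solr/example/solr</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
 </env-entry>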




On Thu, Jun 21, 2012 at 4:24 PM, mcfly04 hil...@csc-scc.gc.ca wrote:

 Thank you for your response.

 Are you referring to the SolrRequestFilter path-prefix?
 Here is a copy of my web.xml:

 [...]


 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Solr-with-Tomcat-on-VPS-tp3990397p3990687.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr with Tomcat on VPS

2012-06-21 Thread irshad siddiqui
Hi,

You can also refer to the URL below for Solr configuration:

http://tek-manthan.blogspot.in/

Regards,
Irshad



On Thu, Jun 21, 2012 at 4:24 PM, mcfly04 hil...@csc-scc.gc.ca wrote:

 Thank you for your response.

 Are you referring to the SolrRequestFilter path-prefix?
 Here is a copy of my web.xml:

 [...]


 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Solr-with-Tomcat-on-VPS-tp3990397p3990687.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: solr java.lang.NullPointerException on select queries

2012-06-21 Thread Erick Erickson
Ah, OK, I misunderstood. OK, here's a couple of off-the-top-of-my-head
ideas.

Make a backup of your index before anything else <g>...

Split up your current index into two parts by segments. That is, copy the
whole directory to another place, and remove some of the segments from
each. I.e. when you're done, you'll still have all the segments you used
to have, but some of them will be in one directory and some in another.
Of course all of the segment files with a common prefix should stay
together (e.g. all the _0.* files in the same dir, not split between the
two dirs).

Now run CheckIndex on them. That'll take a long time, but it _should_
spoof Solr/Lucene into thinking that there are two complete indexes out
there. Now your idea of having an archival search should work, but with
two places to look, not one. NOTE: whether this plays nice with the
over-2B docs or deleted documents I can't guarantee; I believe that the
deleted docs are tracked per segment, and if so this should be fine. This
won't work if you've recently optimized. When you're done you should have
two cores out there (hmmm, these could also be treated as shards?) that
you point your Solr at.

You might want to optimize in this case when you're done.
I suspect you could, with a magnetized needle and a steady hand, edit
some of the auxiliary files (segments*), but I would feel more secure
letting CheckIndex do the heavy lifting.
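
(For reference, CheckIndex can be run from the command line against each
half; the jar version and index path here are illustrative:

java -cp lucene-core-3.5.0.jar org.apache.lucene.index.CheckIndex /path/to/index-half-1

Add -fix only if you want it to drop unrecoverable segments; that is
destructive, so run it on the copies.)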

Here's another possibility:
- Try a delete-by-query from a bit before the date you think things went
  over 2B to now (really hope you have a date!). A sketch of such a query
  follows this list.
- Perhaps you can walk the underlying index in Lucene somehow and make
  this work if you don't have a date. Since the underlying Lucene IDs are
  segment_base + local_segment_count, this should be safely under 2B,
  but I'm reaching here into areas I don't know much about.
- Optimize (and wait, probably a really long time).
- Re-index everything after the date (or whatever) you used above into a
  new shard.
- Now treat the big index just as you were talking about.
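
(A sketch of that delete-by-query, assuming a hypothetical "tstamp" field
and cutoff date; adjust both to your schema:

curl 'http://localhost:8983/solr/update?commit=true' -H 'Content-Type: text/xml' \
  --data-binary '<delete><query>tstamp:[2012-05-01T00:00:00Z TO *]</query></delete>'
)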

Please understand that the over-2B docs might cause some grief here, but
since the underlying index is segment-based (i.e. the internal Lucene doc
IDs are a base+offset for each segment), this has a decent chance of
working (but anyone who really understands, please chime in; I'm reaching).

Oh, and if it works, please let us know...

Best
Erick

On Wed, Jun 20, 2012 at 6:37 PM, avenka ave...@gmail.com wrote:
 Erick, thanks for the advice, but let me make sure you haven't misunderstood
 what I was asking.

 I am not trying to split the huge existing index in install1 into shards. I
 am also not trying to make the huge install1 index as one shard of a sharded
 solr setup. I plan to use a sharded setup only for future docs.

 I do want to avoid trying to re-index the docs in install1 and think of them
 as a slow tape archive index server if I ever need to go and query the
 past documents. So I was wondering if I could somehow use the existing
 segment files to run an isolated (unsharded) solr server that lets me query
 roughly the first 2B docs before the wraparound problem happened. If the
 negative internal doc IDs have pervasively corrupted the segment files,
 this would not be possible, but I am not able to imagine an underlying
 lucene design that would cause such a problem. Is my only option to re-index
 the past 2B docs if I want to be able to query them at this point or is
 there any way to use the existing segment files?

 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/solr-java-lang-NullPointerException-on-select-queries-tp3989974p3990615.html
 Sent from the Solr - User mailing list archive at Nabble.com.


how to import product of entity date with DIH

2012-06-21 Thread jueljust
i need to import data from sql server and cassandra
first, get user ids from sql server
then get one user's characters from cassandra by user id
last, save the users characters doc into solr
one user have multi charcters
and i need to save the doc like 

1040.txt
<row uniqueid="10031016578048" passportid="1040" character="Ranea"/>
<row uniqueid="10031016578049" passportid="1040" character="assinissa"/>
<row uniqueid="1005101793120" passportid="1040" character="AmmSmashYous"/>
<row uniqueid="1005101793121" passportid="1040" character="Dangless"/>
<row uniqueid="1007102768032" passportid="1040" character="sees"/>
<row uniqueid="10131031905" passportid="1040" character="Loopz"/>
<row uniqueid="10131031907" passportid="1040" character="MyLongName"/>
<row uniqueid="10141031680" passportid="1040" character="Rawr"/>
<row uniqueid="10261043118" passportid="1040" character="Firebald"/>
<row uniqueid="10191054480" passportid="1040" character="salt"/>

19734880.txt
<row uniqueid="10091011208112" passportid="19734880" character="3Negreteo"/>
<row uniqueid="10091011208113" passportid="19734880" character="3BlaKin"/>

Field "uniqueid" is the character ID,
field "passportid" is the user ID,
and field "character" is the character name.

I wrote the DIH config like the following:

<dataConfig>
    <dataSource name="mssql"
        driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
        url="jdbc:sqlserver://localhost:1433;databaseName=pw_account;"
    />
    <dataSource name="url" type="URLDataSource" />
    <dataSource name="reader" type="FieldReaderDataSource" />
    <document>
        <entity name="user" dataSource="mssql"
            query="SELECT top 100 id FROM pw_account..account">
            <entity name="line" dataSource="url"
                processor="LineEntityProcessor"
                url="http://localhost/${user.id}.txt"
                format="text" encoding="UTF-8" connectionTimeout="5000"
                readTimeout="10">
                <entity name="xml" dataSource="reader"
                    processor="XPathEntityProcessor"
                    dataField="line.rawLine" forEach="/row"
                    rootEntity="false">
                    <field column="id" name="id" xpath="/row/@uniqueid" />
                    <field column="character" name="character"
                        xpath="/row/@character" />
                    <field column="passportid" name="passportid"
                        xpath="/row/@passportid" />
                </entity>
            </entity>
        </entity>
    </document>
</dataConfig>


But it doesn't work.

I got the result in CSV format like this:

id,passportid,character
1,1040,Ranea
2,,
3,,
4,,

Only user 1040's first character is imported, and the imported IDs are
all wrong.

How do I write the correct DIH config?

Regards


--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-import-product-of-entity-date-with-DIH-tp3990694.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr with Tomcat on VPS

2012-06-21 Thread mcfly04
Thanks for clarifying!

I have the solr/home configured in the server.xml of Tomcat. When I had not
set it properly there were errors in the log. It is configured correctly
now, as there are no errors in the log regarding solr/home.

The issue is that I cannot access any of the servlets. I can access any of
the JSPs by name.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-with-Tomcat-on-VPS-tp3990397p3990712.html
Sent from the Solr - User mailing list archive at Nabble.com.


LeaderElection

2012-06-21 Thread Trym R. Møller

Hi

While messing with the behaviour when Solr loses its ZooKeeper connection,
I'm trying to reproduce how a replica of a slice becomes leader. I have
made the unit test below in the LeaderElectionTest class, and it fails.
I don't know whether this simulates how Solr uses the LeaderElection class,
but please comment on the scenario.


Thanks in advance.

Best regards Trym

  @Test
  public void testMemoryElection() throws Exception {
    LeaderElector first = new LeaderElector(zkClient);
    ZkNodeProps props = new ZkNodeProps(ZkStateReader.BASE_URL_PROP,
        "http://127.0.0.1/solr/", ZkStateReader.CORE_NAME_PROP, "1");
    ElectionContext firstContext = new ShardLeaderElectionContextBase(first,
        "slice1", "collection2", "dummynode1", props, zkStateReader);
    first.setup(firstContext);
    first.joinElection(firstContext);

    Thread.sleep(1000);
    assertEquals("original leader was not registered",
        "http://127.0.0.1/solr/1/", getLeaderUrl("collection2", "slice1"));

    SolrZkClient zkClient2 = new SolrZkClient(server.getZkAddress(), TIMEOUT);

    LeaderElector second = new LeaderElector(zkClient2);
    props = new ZkNodeProps(ZkStateReader.BASE_URL_PROP,
        "http://127.0.0.1/solr/", ZkStateReader.CORE_NAME_PROP, "2");
    ElectionContext context = new ShardLeaderElectionContextBase(second,
        "slice1", "collection2", "dummynode1", props, zkStateReader);
    second.setup(context);
    second.joinElection(context);
    Thread.sleep(1000);
    assertEquals("original leader should have stayed leader",
        "http://127.0.0.1/solr/1/", getLeaderUrl(zkClient2, "collection2", "slice1"));

    server.expire(zkClient.getSolrZooKeeper().getSessionId());

    assertEquals("new leader was not registered",
        "http://127.0.0.1/solr/2/", getLeaderUrl(zkClient2, "collection2", "slice1"));
  }



Re: Editing solr update handler sub class

2012-06-21 Thread Erick Erickson
Hmmm. I think you would have a _far_ easier time of this just getting
all the source code, modifying the relevant source, and then just
issuing an "ant dist" (or "ant example" if you wanted to try it). There are
other targets that will package up the whole thing just like you would get
it from the website.

And consider making a plugin rather than modifying DirectUpdateHandler2.
Your custom update handler can _inherit_ from that class and do your
special stuff. It's still easier, IMO, if you get the complete source.
Executing "ant dist" will put all the files in the dist folder as Irshad
says.
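
For illustration only, a minimal sketch of such a plugin against the Solr
3.x API; the package and class names are hypothetical, and the override
shown is where your "keep the first tstamp" logic would go:

package com.example.solr;

import java.io.IOException;
import org.apache.solr.core.SolrCore;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.DirectUpdateHandler2;

// Inherit rather than edit DirectUpdateHandler2 itself.
public class FirstWriteWinsUpdateHandler extends DirectUpdateHandler2 {

  public FirstWriteWinsUpdateHandler(SolrCore core) throws IOException {
    super(core);
  }

  @Override
  public int addDoc(AddUpdateCommand cmd) throws IOException {
    // Custom logic here, e.g. skip the add when the unique id already
    // exists, so the first-fetched tstamp is preserved.
    return super.addDoc(cmd);
  }
}

You would then point solrconfig.xml at it with something like
<updateHandler class="com.example.solr.FirstWriteWinsUpdateHandler"/>.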

You need an svn client (although I think there are Git repos out there
too), ant, and Ivy (although if you don't have the Ivy stuff, you will be
guided through its installation when you try the ant command).

See: http://wiki.apache.org/solr/HowToContribute for how to get the source
and make a build.

Best
Erick

On Thu, Jun 21, 2012 at 1:36 AM, irshad siddiqui irshad.s...@gmail.com wrote:
  Hi,

 Jar files are located in the dist folder. Check your dist folder, or you
 can check your solrconfig.xml file, where you will find the jar location
 path.


 On Thu, Jun 21, 2012 at 9:47 AM, Shameema Umer shem...@gmail.com wrote:

 Can anybody tell me where are the lucene jar files
 org.apache.lucene.index and org.apache.lucene.search located?

 Thanks
 Shameema

 On Wed, Jun 20, 2012 at 4:44 PM, Shameema Umer shem...@gmail.com wrote:
  Hi,
 
  I decompiled DirectUpdateHandler2.class to a .java file and edited it to
  suit my requirement to stop overwriting duplicates (I needed the first
  fetched tstamp).
  But when I tried to compile it back to a .class file, it showed 91
  errors. Am I wrong anywhere?

  I am new to Java applications but fluent in web languages.
 
  Please help.
 
  Thanks
  Shameema



Re: write.lock

2012-06-21 Thread Dmitry Kan
Hi,

We are running exactly the same Solr version and have these issues
relatively frequently. The main cause in our case has usually been
out-of-memory exceptions, as some of our shards are pretty fat. Allocating
more RAM usually helps for a while. The lock file still needs to be
removed manually, unfortunately.

There are also sometimes commit collisions, and we get "max warming
searchers exceeded" exceptions, but we haven't yet figured out whether
that may cause the locking as well.
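
(For reference, the relevant knobs live in solrconfig.xml; a sketch for
3.x with illustrative values:

<mainIndex>
  <lockType>simple</lockType>
  <!-- clears a leftover lock from a crashed JVM at startup; use with care -->
  <unlockOnStartup>true</unlockOnStartup>
</mainIndex>
)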

-- Dmitry

On Wed, Jun 20, 2012 at 7:45 PM, Christopher Gross cogr...@gmail.com wrote:

 I'm running Solr 3.4. For the past 2 months I've been getting a lot of
 write.lock errors. I switched to the "simple" lockType (and made it clear
 the lock on restart), but my index is still locking up a few times a week.

 I can't seem to determine what is causing the locks -- does anyone out
 there have any ideas/experience as to what is causing the locks, and
 what config changes I can make in order to prevent the lock?

 Any help would be very appreciated!

 -- Chris




-- 
Regards,

Dmitry Kan


Re: solrj and replication

2012-06-21 Thread tom
OK, I tested it myself and a slave running embedded works, just not within
my application -- yet...
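
(For reference, a slave core in 3.5 is normally configured along these
lines; the masterUrl and poll interval are illustrative:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/corename/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
)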


On 20.06.2012 18:14, tom wrote:

Hi,

I was just wondering if I need to do something special if I want an
embedded slave to get replication working?

My setup is like so:
- In my clustered application that uses embedded Solr(J) (for performance),
the cores are configured as slaves that should connect to a master which
runs in a Jetty.
- The embedded cores don't expose any of the Solr servlets.

Note: the slave config, if started in Jetty, does proper replication,
while when embedded it doesn't.


using solr 3.5

Thanks

tom






Re: parameters to decide solr memory consumption

2012-06-21 Thread Erick Erickson
No, that's 255 bytes/record. Also, any time you store a field, the
raw data is preserved in the *.fdt and *.fdx files. If you're thinking
about RAM requirements, you must subtract the amount of data
in those files from the total, as a start. This might help:

http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/fileformats.html
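
(For illustration, the schema described in the quoted message below would
look roughly like this in schema.xml; only the stored="true" fields add to
the *.fdt/*.fdx data:

<field name="uuid"         type="string" indexed="true" stored="true"/>
<field name="key"          type="string" indexed="true" stored="true"/>
<field name="userlocation" type="string" indexed="true" stored="false"/>
)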

Best
Erick

On Thu, Jun 21, 2012 at 1:48 AM, Sachin Aggarwal
different.sac...@gmail.com wrote:
 Thanks for the help.


 Hey,
 I tried an exercise:
 I am storing a schema (uuid, key, userlocation).
 uuid and key are unique, and userlocation has a cardinality of 150.
 uuid and key are stored and indexed, while userlocation is indexed, not
 stored.
 Still, the index directory size is 51 MB for just 200,000 records. Don't
 you think that's suboptimal?
 What if I go for billions of records?

 --

 Thanks & Regards

 Sachin Aggarwal
 7760502772


suggester/autocomplete locks file preventing replication

2012-06-21 Thread tom

Hi,

I'm using the suggester with a file like so:

  <searchComponent class="solr.SpellCheckComponent" name="suggest">
    <lst name="spellchecker">
      <str name="name">suggest</str>
      <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
      <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup</str>

      <!-- Alternatives to lookupImpl:
        org.apache.solr.spelling.suggest.fst.FSTLookup [finite state automaton]
        org.apache.solr.spelling.suggest.jaspell.JaspellLookup [default, jaspell-based]
        org.apache.solr.spelling.suggest.tst.TSTLookup [ternary trees]
      -->
      <!-- the indexed field to derive suggestions from -->
      <!-- TODO must change this to spell or smth alike later -->
      <str name="field">content</str>
      <float name="threshold">0.05</float>
      <str name="buildOnCommit">true</str>
      <str name="weightBuckets">100</str>
      <str name="sourceLocation">autocomplete.dictionary</str>
    </lst>
  </searchComponent>

When trying to replicate, I get the following error message on the slave
side:


2012-06-21 14:34:50,781 ERROR [pool-3-thread-1] handler.ReplicationHandler - SnapPull failed
org.apache.solr.common.SolrException: Unable to rename: path autocomplete.dictionary.20120620120611
    at org.apache.solr.handler.SnapPuller.copyTmpConfFiles2Conf(SnapPuller.java:642)
    at org.apache.solr.handler.SnapPuller.downloadConfFiles(SnapPuller.java:526)
    at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:299)
    at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:268)
    at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:619)

So I dug around and found that Solr's Java process holds a lock on the
autocomplete.dictionary file. Any reason why this is so?


Thanks,

running:
solr 3.5
win7


Re: suggester/autocomplete locks file preventing replication

2012-06-21 Thread tom

BTW: a core unload doesn't release the lock either ;(


On 21.06.2012 14:39, tom wrote:

[...]






Re: suggester/autocomplete locks file preventing replication

2012-06-21 Thread tom

Poking into the code, I think the FileDictionary class is the culprit:
it takes an InputStream as a constructor argument but never releases the
stream. What puzzles me is that the class seems to allow a one-time
iteration, after which the stream is useless, unless I'm missing something
here.

Is there a good reason for this, or is it rather a bug?
Should I move the topic to the dev list?


On 21.06.2012 14:49, tom wrote:

BTW: a core unload doesn't release the lock either ;(


[...]










Re: Commit when a segment is written

2012-06-21 Thread Erick Erickson
I don't think autocommit is deprecated; it's just commented out of the
config, and using commitWithin (assuming you're working from SolrJ) is
preferred if possible.

But what governs a "particular set of docs"? What are the criteria that
determine when you want to commit? Flushes and commits are orthogonal. A
segment is kept open through multiple flushes. That is, there can be many
flushes and the documents still aren't searchable until the first commit
(but it sounds like you're aware of that).

Have you tried using autocommit? And what version of Solr are you using?

And finally, what is your use case for frequent commits? If you're going
after NRT functionality, have you looked at the NRT stuff in 4.x?
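
(For reference, the autocommit section that ships commented out in
solrconfig.xml looks roughly like this; the thresholds are illustrative:

<autoCommit>
  <maxDocs>10000</maxDocs>   <!-- commit after this many docs -->
  <maxTime>60000</maxTime>   <!-- or after this many milliseconds -->
</autoCommit>
)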

Best
Erick

On Thu, Jun 21, 2012 at 8:01 AM, Ramprakash Ramamoorthy
youngestachie...@gmail.com wrote:
 Dear,

        I am using Lucene/Solr for my log search tool. Is there a way I
 can perform a commit operation on my IndexWriter when a particular set
 of docs is flushed from memory to the disk? My RamBufferSize is 24 MB
 and MergeFactor is 10.

        Or is calling commit at frequent intervals, irrespective of the
 flushes, the only way? I wish the autocommit feature was not deprecated.


 --
 With Thanks and Regards,
 Ramprakash Ramamoorthy,
 Engineer Trainee,
 Zoho Corporation.
 +91 9626975420


Solr 4.0 with Near Real Time and Faceted Search in Replicated topology

2012-06-21 Thread Niran Fajemisin
Hi all,

We're thinking of moving forward with Solr 4.0, and we plan to have a
master index server and at least two slave servers. The master server will
be used primarily for indexing, and queries will be load-balanced across
the replicated slave servers. I would like to know whether, with the
current support for Near Real Time search in 4.0, there is support for
faceted search, keeping in mind that the searches will be performed
against the slave servers and not the master (indexing) server.

If it's not supported, will we need to use SolrCloud to gain the benefits of 
Near Real Time search when performing Faceted Searches?

Any insight would be greatly appreciated.

Thanks all! 

Re: solr java.lang.NullPointerException on select queries

2012-06-21 Thread avenka
Erick, much thanks for detailing these options. I am currently trying the
second one as that seems a little easier and quicker to me.

I successfully deleted documents with IDs after the problem time that I do
know to an accuracy of a couple hours. Now, the stats are:
  numDocs : 2132454075
  maxDoc : -2130733352 
The former is nicely below 2^31. But I can't seem to get the latter to
decrease and become positive by deleting further. 

Should I just run an optimize at this point? I have never manually run an
optimize and plan to just hit 
  http://machine_name/solr/update?optimize=true
Can you confirm this?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-java-lang-NullPointerException-on-select-queries-tp3989974p3990798.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Exception using distributed field-collapsing

2012-06-21 Thread Young, Cody
Does it work in the non-distributed case?

Is the field you're grouping on stored? What is the type on the uniqueKey 
field? Is it stored and indexed?

I've had a problem with distributed not working when the uniqueKey field was 
indexed but not stored.

Also, in distributed searches, the uniqueKey is used to retrieve documents
from shards, so if it were, say, a date, that may be causing the issue.

Cody

-----Original Message-----
From: Bryan Loofbourrow [mailto:bloofbour...@knowledgemosaic.com] 
Sent: Wednesday, June 20, 2012 1:54 PM
To: solr-user@lucene.apache.org
Subject: RE: Exception using distributed field-collapsing

 Hi Bryan,

 What is the fieldtype of the groupField? You can only group by field 
 that is of type string as is described in the wiki:
 http://wiki.apache.org/solr/FieldCollapsing#Request_Parameters

 When you group by another field type an HTTP 400 should be returned
 instead of this error. At least that's what I'd expect.

 Martijn

Martijn,

The group-by field is a string. I have been unable to figure how a date comes 
into the picture at all, and have basically been wondering if there is some 
problem in the grouping code that misaligns the field values from different 
results in the group, so that it is not comparing like with like. Not a strong 
theory, just the only thing I can think of.

-- Bryan


RE: Exception using distributed field-collapsing

2012-06-21 Thread Bryan Loofbourrow
Cody,

 Does it work in the non distributed case?

Yes.


 Is the field you're grouping on stored? What is the type on the
uniqueKey
 field? Is it stored and indexed?

The field I'm grouping on is a string, stored and indexed. The unique key
field is a string, stored and indexed.

 I've had a problem with distributed not working when the uniqueKey field
 was indexed but not stored.

Was it the same exception I'm seeing?

-- Bryan


 -----Original Message-----
 From: Bryan Loofbourrow [mailto:bloofbour...@knowledgemosaic.com]
 Sent: Wednesday, June 20, 2012 1:54 PM
 To: solr-user@lucene.apache.org
 Subject: RE: Exception using distributed field-collapsing

 [...]

RE: Exception using distributed field-collapsing

2012-06-21 Thread Young, Cody
No, I believe it was a different exception; just brainstorming. (It was a
null reference, IIRC.)

Does a *:* query with no sorting work?
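
(For example, hypothetically, something like
http://localhost:8983/solr/select?q=*:*&group=true&group.field=yourfield&shards=shard1,shard2
with no sort parameter.)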

Cody

 -----Original Message-----
From: Bryan Loofbourrow [mailto:bloofbour...@knowledgemosaic.com] 
Sent: Thursday, June 21, 2012 10:33 AM
To: solr-user@lucene.apache.org
Subject: RE: Exception using distributed field-collapsing

 [...]


Re: solr java.lang.NullPointerException on select queries

2012-06-21 Thread Erick Erickson
Right, if you optimize, at the end maxDoc should == numDocs.
Usually the document reclamation stuff is done when segments
merge, but that won't happen in this case since this index is
becoming static, so a manual optimize is probably indicated.

Something like this should also work, either way:
http://localhost:8983/solr/update?stream.body=<optimize/>

But be prepared to wait for a very long time.

I'd copy it somewhere else first just for safety's sake

Best
Erick

On Thu, Jun 21, 2012 at 12:52 PM, avenka ave...@gmail.com wrote:
 [...]


RE: Exception using distributed field-collapsing

2012-06-21 Thread Bryan Loofbourrow
 Does a *:* query with no sorting work?

Well, this is interesting. Leaving q= as it was, but removing the sort,
makes the whole thing work.

And if you were thinking of asking whether the sort field is a date, the
answer is yes, it's an indexed and stored DateField. It's also on the list
of fields whose values I am requesting with fl=.

So I guess this is likely to be the date that is somehow turning up in the
ClassCastException. Great suggestion. Thanks, Cody.

Now I'm wondering if anyone familiar with the Field Collapsing code can
see a possible vector for a bug, given this fleshing out of the bug
conditions.

-- Bryan

 -----Original Message-----
 From: Young, Cody [mailto:cody.yo...@move.com]
 Sent: Thursday, June 21, 2012 11:04 AM
 To: solr-user@lucene.apache.org
 Subject: RE: Exception using distributed field-collapsing

 [...]