Problem with Master

2011-04-29 Thread Ezequiel Calderara
Hello, I'm having some issues with replication in my production environment.

I have a master and 4 slaves.
I had some data indexed, and it was replicated successfully.

We are close to making the production environment public, so I deleted
the old data by deleting the data folder on the master.
Then I reloaded the master (from the Tomcat manager), hoping that
the slaves would pick up the new empty index.

But when I open the replication page on each slave, I see that
they still have the old index. Even if I manually tell them to replicate,
the replication count increases but the indexed data doesn't change.

I checked the logs of the master and the slaves and see no errors. I do
see the /replication requests reaching the master.
I set the SolrCore and ReplicationHandler log levels to FINEST.
Still nothing.

I queried the slave with command=details
and saw a ReplicationList and a FailedList, with the failed
list indicating that replication is failing. But I don't know why,
and I don't know where to look for the error.

Thanks in advance, this is really urgent.


PS: I hope this gets through the spam filter...


Re: Problem with Master

2011-04-29 Thread Ezequiel Calderara
Just to add more info... this is the result of a replication command=details request.

I'm really confused that masterDetails/indexSize is 52 bytes (which is
correct), while the slave's indexSize is 303.8 KB.

<?xml version="1.0" encoding="UTF-8" ?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">15</int>
  </lst>
  <lst name="details">
    <str name="indexSize">303.8 KB</str>
    <str name="indexPath">D:\Solr\data\solr\index</str>
    <arr name="commits" />
    <str name="isMaster">false</str>
    <str name="isSlave">true</str>
    <long name="indexVersion">1301331343628</long>
    <long name="generation">3</long>
    <lst name="slave">
      <lst name="masterDetails">
        <str name="indexSize">52 bytes</str>
        <str name="indexPath">D:\Solr\data\solr\index</str>
        <arr name="commits">
          <lst>
            <long name="indexVersion">1304086785516</long>
            <long name="generation">1</long>
            <arr name="filelist">
              <str>segments_1</str>
            </arr>
          </lst>
        </arr>
        <str name="isMaster">true</str>
        <str name="isSlave">false</str>
        <long name="indexVersion">1304086785516</long>
        <long name="generation">1</long>
      </lst>
      <str name="masterUrl">http://192.168.211.185:8787/solr/replication</str>
      <str name="pollInterval">00:00:60</str>
      <str name="nextExecutionAt">Fri Apr 29 12:04:57 ART 2011</str>
      <str name="indexReplicatedAt">Fri Apr 29 12:03:57 ART 2011</str>
      <arr name="indexReplicatedAtList">
        <str>Fri Apr 29 12:03:57 ART 2011</str>
        <str>Fri Apr 29 12:02:57 ART 2011</str>
        <str>Fri Apr 29 12:01:57 ART 2011</str>
        <str>Fri Apr 29 12:00:57 ART 2011</str>
        <str>Fri Apr 29 11:59:57 ART 2011</str>
        <str>Fri Apr 29 11:58:57 ART 2011</str>
        <str>Fri Apr 29 11:57:57 ART 2011</str>
        <str>Fri Apr 29 11:56:57 ART 2011</str>
        <str>Fri Apr 29 11:55:57 ART 2011</str>
        <str>Fri Apr 29 11:54:57 ART 2011</str>
      </arr>
      <arr name="replicationFailedAtList">
        <str>Fri Apr 29 12:03:57 ART 2011</str>
        <str>Fri Apr 29 12:02:57 ART 2011</str>
        <str>Fri Apr 29 12:01:57 ART 2011</str>
        <str>Fri Apr 29 12:00:57 ART 2011</str>
        <str>Fri Apr 29 11:59:57 ART 2011</str>
        <str>Fri Apr 29 11:58:57 ART 2011</str>
        <str>Fri Apr 29 11:57:57 ART 2011</str>
        <str>Fri Apr 29 11:56:57 ART 2011</str>
        <str>Fri Apr 29 11:55:57 ART 2011</str>
        <str>Fri Apr 29 11:54:57 ART 2011</str>
      </arr>
      <str name="timesIndexReplicated">44794</str>
      <str name="confFilesReplicated">[solrconfig_slave.xml]</str>
      <str name="timesConfigReplicated">1</str>
      <str name="confFilesReplicatedAt">1301405968250</str>
      <str name="lastCycleBytesDownloaded">0</str>
      <str name="timesFailed">44792</str>
      <str name="replicationFailedAt">Fri Apr 29 12:03:57 ART 2011</str>
      <str name="previousCycleTimeInSeconds">0</str>
      <str name="isPollingDisabled">false</str>
      <str name="isReplicating">false</str>
    </lst>
  </lst>
  <str name="WARNING">This response format is experimental. It is
likely to change in the future.</str>
</response>

On Fri, Apr 29, 2011 at 11:52 AM, Ezequiel Calderara ezech...@gmail.com wrote:




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Curl bulk XML

2011-04-13 Thread Ezequiel Calderara
With post.jar I think you can do something like...
java -jar post.jar A*.xml
java -jar post.jar B*.xml
java -jar post.jar C*.xml
java -jar post.jar D*.xml

(I'm on Windows)

On Wed, Apr 13, 2011 at 4:41 PM, Markus Jelsma
markus.jel...@openindex.iowrote:

 Either put all the documents in one large file or loop over them with a
 simple shell script.

  Hey guys, how do you curl update all the XML inside a folder from A-D?
  Example: curl http://localhost:8080/solr update
  Sent from my iPhone




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Problems indexing very large set of documents

2011-04-08 Thread Ezequiel Calderara
Maybe those files were created with a different Adobe format version...

See this:
http://lucene.472066.n3.nabble.com/PDF-parser-exception-td644885.html

On Fri, Apr 8, 2011 at 12:14 PM, Brandon Waterloo 
brandon.water...@matrix.msu.edu wrote:

 A second test has revealed that it has something to do with the contents,
 and not the literal filenames, of the second set of files.  I renamed one of
 the second-format files and tested it and Solr still failed.  However, the
 problem still only applies to those files with the second naming format.
 
 From: Brandon Waterloo [brandon.water...@matrix.msu.edu]
 Sent: Friday, April 08, 2011 10:40 AM
 To: solr-user@lucene.apache.org
 Subject: RE: Problems indexing very large set of documents

 I had some time to do some research into the problems.  From what I can
 tell, it appears Solr is tripping up over the filename.  These are strictly
 examples, but, Solr handles this filename fine:

 32-130-A0-84-african_activist_archive-a0a6s3-b_12419.pdf

 However, it fails with either a parsing error or an EOF exception on this
 filename:

 32-130-A08-84-al.sff.document.nusa197102.pdf

 The only significant difference is that the second filename contains
 multiple periods.  As there are about 1700 files whose filenames are similar
 to the second format it is simply not possible to change their filenames.
  In addition they are being used by other applications.

 Is there something I can change in Solr configs to fix this issue or am I
 simply SOL until the Solr dev team can work on this? (assuming I put in a
 ticket)

 Thanks again everyone,

 ~Brandon Waterloo


 
 From: Chris Hostetter [hossman_luc...@fucit.org]
 Sent: Tuesday, April 05, 2011 3:03 PM
 To: solr-user@lucene.apache.org
 Subject: RE: Problems indexing very large set of documents

 : It wasn't just a single file, it was dozens of files all having problems
 : toward the end just before I killed the process.
...
 : That is by no means all the errors, that is just a sample of a few.
 : You can see they all threw HTTP 500 errors.  What is strange is, nearly
 : every file succeeded before about the 2200-files-mark, and nearly every
 : file after that failed.

 ...the root question is: do those files *only* fail if you have already
 indexed ~2200 files, or do they fail if you start up your server and index
 them first?

 There may be a resource issue (if it only happens after indexing 2200), or
 it may just be a problem with a large number of your PDFs that your
 iteration code just happens to get to at that point.

 If it's the former, then there may be something buggy about how Solr is
 using Tika that causes the problem -- if it's the latter, then it's a straight
 Tika parsing issue.

 :  now, commit is set to false to speed up the indexing, and I'm assuming that
 :  Solr should be auto-committing as necessary.  I'm using the default
 :  solrconfig.xml file included in apache-solr-1.4.1\example\solr\conf.  Once

 solr does no autocommitting by default, you need to check your
 solrconfig.xml


 -Hoss
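As a pointer for that last remark: autocommit, when wanted, is configured in solrconfig.xml inside the update handler. The snippet below is a sketch; the threshold values are illustrative, not taken from this thread:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- commit automatically after this many added documents... -->
    <maxDocs>10000</maxDocs>
    <!-- ...or after this many milliseconds, whichever comes first -->
    <maxTime>60000</maxTime>
  </autoCommit>
</updateHandler>
```

With no autoCommit block present, nothing is committed until the client sends an explicit commit.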




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Problems indexing very large set of documents

2011-04-08 Thread Ezequiel Calderara
Ohh sorry... I didn't realize they had already sent you that link :P

On Fri, Apr 8, 2011 at 12:35 PM, Ezequiel Calderara ezech...@gmail.com wrote:

 Maybe those files are created with a different Adobe Format version...

 See this:
 http://lucene.472066.n3.nabble.com/PDF-parser-exception-td644885.html





-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Solr without Server / Search solutions with Solr on DVD (examples?)

2011-04-07 Thread Ezequiel Calderara
Can't you just run a Jetty server in the background?

Though some antivirus or antispyware might flag that as a trojan or
something like that.

How little main memory is it? 1 GB? Less?

I don't think you'll have problems above 1 GB. The index will
be static: no changes, no optimizations...

That's my thought.

On Thu, Apr 7, 2011 at 11:12 AM, karsten-s...@gmx.de wrote:

 Hi folks,

 we want to migrate our search portal to Solr.
 But some of our customers search our information offline with a
 DVD version.
 So we want to estimate the complexity of a Solr DVD version.
 This means trimming Solr to work on small computers, the opposite of
 heavy loads. So no server optimizations, no cache, fewer facet terms in
 memory...

 My question:
 Does anyone know examples of solutions with Solr starting from DVD?

 Is there a tutorial for “configure a slow Solr for Computer with little
 main memory”?

 Any best practice tips from yourself?


 Best regards
  Karsten




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Solr without Server / Search solutions with Solr on DVD (examples?)

2011-04-07 Thread Ezequiel Calderara
Try setting up a virtual machine and see how it performs.

I'm really not a Java guy, so I don't know how to tune it for
performance...

But AFAIK Solr handles a static index in RAM pretty well...

On Thu, Apr 7, 2011 at 2:48 PM, Karsten Fissmer karsten-s...@gmx.de wrote:

 Hi yonik, Hi Ezequiel,

 Java is no problem for an DVD Version. We already have a DVD version with
 Servlet-Container (but this does currently not use Solr).

 Some of our customers work in public sector institutions and have less than
 1 GB of main memory, but they use MS Word and IE and..

 But let us say that we can set Xmx384m (we have 14m documents).
 Xmx384m with 14m UnitsOfRetrieval means, e.g., that we cannot allow the same
 fields for sorting as on the server.

 My main interest is an example of other companies/product who delivered
 information on DVD with stand alone Solr.

 Best regards
  Karsten

  ---yonik
  Including a JRE on the DVD and a launch script that uses that JRE by
  default should be doable as well.
  -Yonik
  Jeffrey
  Even if you can ship your DVD with a jetty server, you'll still need
  JAVA
  installed on the customer machine...
 
  ---Karsten
  My question:
  Does anyone know examples of solutions with Solr starting from DVD?
  Is there a tutorial for “configure a slow Solr for Computer with little
 main memory”?
  Any best practice tips from yourself?




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Trying to Post. Emails rejected as spam.

2011-04-07 Thread Ezequiel Calderara
This happened to me a couple of times; I couldn't find a workaround...

On Thu, Apr 7, 2011 at 4:14 PM, Parker Johnson pjoh...@yahoo.com wrote:


 Hello everyone.  Does anyone else have problems posting to the list?  My
 messages keep getting rejected with this response below.  I'll be surprised
 if
 this one makes it through :)

 -Park

 Sorry, we were unable to deliver your message to the following address.

 solr-user@lucene.apache.org:
 Remote  host said: 552 spam score (8.0) exceeded threshold

 (FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,FS_REPLICA,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,RFC_ABUSE_POST,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL
  ) [BODY]

 --- Below this line is a copy of the message.




-- 
__
Ezequiel.

Http://www.ironicnet.com


Solr: Images, Docs and Binary data

2011-04-06 Thread Ezequiel Calderara
Hello everyone, I need to know if anyone has used Solr for indexing and
storing images (up to 16 MB) or binary docs.

How does Solr behave with this type of document? How is performance affected?

Thanks everyone

-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Solr: Images, Docs and Binary data

2011-04-06 Thread Ezequiel Calderara
Another question that may be easier to answer: how can I store binary
data? Any example schema?

2011/4/6 Ezequiel Calderara ezech...@gmail.com

 Hello everyone, i need to know if some has used solr for indexing and
 storing images (upt to 16MB) or binary docs.

 How does solr behaves with this type of docs? How affects performance?

 Thanks Everyone

 --
 __
 Ezequiel.

 Http://www.ironicnet.com




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Solr: Images, Docs and Binary data

2011-04-06 Thread Ezequiel Calderara
Hi, your answers were really helpful.

I was thinking of putting the Base64-encoded file into a string field, but
was a little worried about Solr trying to stem it, vectorize it, or things
like that.

As seen in the example schema.xml:

<!-- Binary data type. The data should be sent/retrieved in as Base64
encoded Strings -->
<fieldtype name="binary" class="solr.BinaryField"/>

Does anyone know of a storage system for images that performs well, other than the FS?

Thanks
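The round trip the schema comment describes can be sketched in a few lines of Java. This uses java.util.Base64, which assumes a modern JRE (Java 8+); older setups would need something like commons-codec instead:

```java
import java.util.Arrays;
import java.util.Base64;

public class BinaryFieldRoundTrip {
    public static void main(String[] args) {
        byte[] original = {0x00, 0x10, 0x7F, (byte) 0xFF};

        // Encode the raw bytes before sending them to the binary field...
        String encoded = Base64.getEncoder().encodeToString(original);

        // ...and decode the stored string after retrieving it.
        byte[] decoded = Base64.getDecoder().decode(encoded);

        System.out.println(encoded);
        System.out.println(Arrays.equals(original, decoded)); // true
    }
}
```

The field itself never sees raw bytes over XML; only the Base64 string travels in the document.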


On Wed, Apr 6, 2011 at 3:31 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

 Ha, there's a binary field type?!

 I've stored binary data in an ordinary String field type, and it's
 worked.  But there were some headaches to get it to work, might have been
 smoother if I had realized there was actually a binary field type.

 But wait, I'm talking about Solr 'stored field', not about indexing. I
 didn't try to index my binary data, just store it for later retrieval
 (knowing this can sometimes be a performance problem, doing it anyway with
 relatively small data, got away with it).  Does the field type even effect
 the _stored values_ in a Solr field?


 On 4/6/2011 2:25 PM, Ryan McKinley wrote:

 You can store binary data using a binary field type -- then you need
 to send the data base64 encoded.

 I would strongly recommend against storing large binary files in solr
 -- unless you really don't care about performance -- the file system
 is a good option that springs to mind.

 ryan




 2011/4/6 Ezequiel Calderara ezech...@gmail.com:

 Another question that maybe is easier to answer, how can i store binary
 data? Any example schema?

 2011/4/6 Ezequiel Calderara ezech...@gmail.com

  Hello everyone, i need to know if some has used solr for indexing and
 storing images (upt to 16MB) or binary docs.

 How does solr behaves with this type of docs? How affects performance?

 Thanks Everyone

 --
 __
 Ezequiel.

 Http://www.ironicnet.com



 --
 __
 Ezequiel.

 Http://www.ironicnet.com




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Solr: Images, Docs and Binary data

2011-04-06 Thread Ezequiel Calderara
On Wed, Apr 6, 2011 at 3:31 PM, Adam Estrada estrada.adam.gro...@gmail.com
 wrote:

 Well...by default there is a pretty decent schema that you can use as a
 template in the example project that builds with Solr. Tika is the library
 that does the actual content extraction so it would be a good idea to try
 the example project out first.


I wanted to know how a large field size affects performance.

But I wasn't sure how to design the schema.


On Wed, Apr 6, 2011 at 4:23 PM, Stefan Matheis 
matheis.ste...@googlemail.com wrote:

 Ezequiel,

 On 06.04.2011 20:38, Ezequiel Calderara wrote:

  Anyone knows any storage for images that performs well, other than FS ?


 you may have a look at http://www.danga.com/mogilefs/ ? :)

 Regards
 Stefan


Stefan, we looked at MogileFS, and also at CouchDB and MongoDB.
AFAIR (as far as I read :P), MogileFS runs on *nix OSes, while we are using
Microsoft as the OS. (Yeah, we are the open source evangelists in our
company :P)

Just for the moment we will start using Solr for storing and indexing
images and docs (indexing some info at least). We have yet to see what the
needs are in terms of scalability to choose between the options.

Thanks all...
If you have more info, send it :)

-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Solrcore.properties

2011-03-29 Thread Ezequiel Calderara
Hi Jayendra, this is the content of the files:
In the Master:
 + SolrConfig.xml : http://pastebin.com/JhvwMTdd
In the Slave:
 + solrconfig.xml: http://pastebin.com/XPuwAkmW
 + solrcore.properties: http://pastebin.com/6HZhQG8z

I don't know which other files you need or which others could be involved.

I checked the home environment key in the Tomcat instance and it's OK too.

Any light on this would be appreciated!


On Mon, Mar 28, 2011 at 6:26 PM, Jayendra Patil 
jayendra.patil@gmail.com wrote:

 Can you please attach the other files?
 It doesn't seem to find the enable.master property, so you may want to
 check that the properties file exists on the box having issues.

 We have the following configuration in the core :-

    Core -
    - solrconfig.xml - Master & Slave
        <requestHandler name="/replication" class="solr.ReplicationHandler">
            <lst name="master">
                <str name="enable">${enable.master:false}</str>
                <str name="replicateAfter">commit</str>
                <str name="confFiles">solrcore_slave.properties:solrcore.properties,solrconfig.xml,schema.xml</str>
            </lst>
            <lst name="slave">
                <str name="enable">${enable.slave:false}</str>
                <str name="masterUrl">http://master_host:port/solr/corename/replication</str>
            </lst>
        </requestHandler>

- solrcore.properties - Master
enable.master=true
enable.slave=false

- solrcore_slave.properties - Slave
enable.master=false
enable.slave=true

 We have default values and a separate properties file for Master and
 slave.
 Replication is enabled via the solrcore.properties file.

 Regards,
 Jayendra

 On Mon, Mar 28, 2011 at 2:06 PM, Ezequiel Calderara ezech...@gmail.com
 wrote:
  Hi all, i'm having problems when deploying solr in the production
 machines.
 
  I have a master solr, and 3 slaves.
  The master replicates the schema and the solrconfig to the slaves (this
  file on the master is named solrconfig_slave.xml).
  The solrconfig of the slaves takes values such as ${data.dir} from
  solrcore.properties.
 
  I think that Solr isn't recognizing that file, because I get this error:
 
  HTTP Status 500 - Severe errors in solr configuration. Check your log
  files for more detailed information on what may be wrong. If you want
 solr
  to continue after configuration errors, change:
  <abortOnConfigurationError>false</abortOnConfigurationError> in null
  -
  org.apache.solr.common.SolrException: No system property or default
 value
  specified for enable.master at
  org.apache.solr.common.util.DOMUtil.substituteProperty(DOMUtil.java:311)
  ... MORE STACK TRACE INFO...
 
  But here is the thing:
  org.apache.solr.common.SolrException: No system property or default value
  specified for enable.master
 
  I'm attaching the master schema, the master solr config, the solr config
 of
  the slaves and the solrcore.properties.
 
  If anyone has any info on this i would be more than appreciated!...
 
  Thanks
 
 
  --
  __
  Ezequiel.
 
  Http://www.ironicnet.com
 




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Solrcore.properties

2011-03-29 Thread Ezequiel Calderara
I think that i found the problem:
The contents of the solrcore.properties were:

 #solrcore.properties
 data.dir=D:\Solr\data\solr\
 enable.master=false
 enable.slave=true
 masterUrl=http://url:8787/solr/
 pollInterval=00:00:60

I found a folder on D:\ called SolrDatasolrenable.master=false.
So I researched a little, tested a little more, and found that I
have to escape the backslashes in data.dir like this:

 #solrcore.properties
 data.dir=D:\\Solr\\data\\solr\\
 enable.master=false
 enable.slave=true
 masterUrl=http://url:8787/solr/
 pollInterval=00:00:60

And Problem solved, for now at least :P
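The folder name above is exactly what java.util.Properties produces here: a backslash before an ordinary character is silently dropped, and a trailing backslash at the end of a line acts as a line continuation that swallows the next key. A small sketch reproducing both snippets (values mirror the ones quoted above):

```java
import java.io.StringReader;
import java.util.Properties;

public class PropertiesEscapeDemo {
    public static void main(String[] args) throws Exception {
        // Unescaped: \S, \d, \s lose their backslashes, and the trailing
        // backslash joins the enable.master line onto the data.dir value.
        Properties broken = new Properties();
        broken.load(new StringReader("data.dir=D:\\Solr\\data\\solr\\\nenable.master=false\n"));
        System.out.println(broken.getProperty("data.dir"));      // D:Solrdatasolrenable.master=false
        System.out.println(broken.getProperty("enable.master")); // null

        // Escaped: each \\ in the file collapses to one literal backslash.
        Properties fixed = new Properties();
        fixed.load(new StringReader("data.dir=D:\\\\Solr\\\\data\\\\solr\\\\\nenable.master=false\n"));
        System.out.println(fixed.getProperty("data.dir"));       // D:\Solr\data\solr\
        System.out.println(fixed.getProperty("enable.master"));  // false
    }
}
```

(Forward slashes, D:/Solr/data/solr/, would also avoid the escaping entirely and are accepted on Windows.)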

On Tue, Mar 29, 2011 at 8:37 AM, Ezequiel Calderara ezech...@gmail.com wrote:

 Hi Jayendra, this is the content of the files:
 In the Master:
  + SolrConfig.xml : http://pastebin.com/JhvwMTdd
 In the Slave:
  + solrconfig.xml: http://pastebin.com/XPuwAkmW
  + solrcore.properties: http://pastebin.com/6HZhQG8z

 I don't know which other files do you need or could be involved in this.

 I checked the home environment key in the tomcat instance and its ok too.

 Any light on this would be appreciated!






-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Solrcore.properties

2011-03-29 Thread Ezequiel Calderara
Just for the record, in case anyone else is having trouble: the masterUrl should
be http://url:port/solr/replication (don't forget the /replication part!)

On Tue, Mar 29, 2011 at 9:44 AM, Ezequiel Calderara ezech...@gmail.com wrote:

 I think that i found the problem:
 The contents of the solrcore.properties were:

 #solrcore.properties
 data.dir=D:\Solr\data\solr\

 enable.master=false
 enable.slave=true
 masterUrl=http://url:8787/solr/
 pollInterval=00:00:60

 I found a folder in the D:\ called: SolrDatasolrenable.master=false
 So i researched a little and tested another little more and i found that i
 have to escape the data.dir like this:

 #solrcore.properties
 data.dir=D:\\Solr\\data\\solr\\

 enable.master=false
 enable.slave=true
 masterUrl=http://url:8787/solr/
 pollInterval=00:00:60

 And Problem solved, for now at least :P





-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: help with Solr installation within Tomcat7

2011-03-22 Thread Ezequiel Calderara
Where are your Solr files (war, conf files) located? How did you
deploy Solr in Tomcat?

On Tue, Mar 22, 2011 at 7:08 PM, Erick Erickson erickerick...@gmail.com wrote:

 What error are you receiving? Checking your config files for any
 absolute (rather than relative) paths would be my first guess...

 Best
 Erick

 On Tue, Mar 22, 2011 at 10:09 AM,  ramdev.wud...@thomsonreuters.com
 wrote:
  Hi All:
I have just started using Solr and have it successfully installed
 within a Tomcat7 Webapp server.
  I have also indexed documents using the SolrJ interfaces. The following
 is my problem:
 
  I installed Solr under Tomcat7 folders and setup an xml configuration
 file to indicate the Solr home variables as detailed on the wiki (for Solr
 install within TOmcat)
  The indexes seem to reside within the solr_home folder under the data
 folder  (Solr_home/data/index )
 
  However, when I make a zip copy of the complete install (i.e. Tomcat
 with Solr) and move it to a different machine and unzip/install it,
 the index seems to be inaccessible. (I did change the solr.xml
 configuration variables to point to the new location.)
 
  From what I know, with tomcat installations, it should be as simple as
 zipping a current working installation and unzipping/installing  on a
 different machine/location.
 
  Am I missing something that makes Solr hardcode the path to the index
 in an install ?
 
  Simply put, I would like to know how to transport an existing install
  of Solr within Tomcat 7 from one machine to another and still have it
 working.
 
  Ramdev
 




-- 
__
Ezequiel.

Http://www.ironicnet.com
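
For the transport question above, the usual culprit is an absolute solr/home (or dataDir) path baked into the Tomcat context file or solrconfig.xml. A sketch of a relocatable context fragment, assuming the JNDI-based setup from the Solr-on-Tomcat wiki (file names and paths are illustrative):

```xml
<!-- conf/Catalina/localhost/solr.xml: solr/home expressed relative to
     ${catalina.home}, so the zipped install works wherever it is unpacked -->
<Context docBase="${catalina.home}/webapps/solr.war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String"
               value="${catalina.home}/solr" override="true"/>
</Context>
```

With this in place, dataDir in solrconfig.xml should also be left relative (or derived from solr/home) rather than absolute.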


Dismax problem

2011-02-15 Thread Ezequiel Calderara
Hi, I'm having a problem while trying to do a dismax search.
For example, I have the standard query URL like this:
It returns 1 result.
But when I try to use the dismax query type, I get the following error:

 15/02/2011 10:27:07 org.apache.solr.common.SolrException log
 GRAVE: java.lang.ArrayIndexOutOfBoundsException: 28
 at
 org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:721)
 at
 org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:224)
 at
 org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:692)
 at
 org.apache.solr.search.function.StringIndexDocValues.init(StringIndexDocValues.java:35)
 at
 org.apache.solr.search.function.OrdFieldSource$1.init(OrdFieldSource.java:84)
 at
 org.apache.solr.search.function.OrdFieldSource.getValues(OrdFieldSource.java:58)
 at
 org.apache.solr.search.function.FunctionQuery$AllScorer.init(FunctionQuery.java:123)
 at
 org.apache.solr.search.function.FunctionQuery$FunctionWeight.scorer(FunctionQuery.java:93)
 at
 org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:297)
 at
 org.apache.lucene.search.IndexSearcher.searchWithFilter(IndexSearcher.java:268)
 at
 org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:258)
 at org.apache.lucene.search.Searcher.search(Searcher.java:171)
 at
 org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
 at
 org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
 at
 org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
 at
 org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
 at
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:203)
 at
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
 at
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
 at
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:242)
 at
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
 at
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:243)
 at
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:201)
 at
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:163)
 at
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:108)
 at
 org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:556)
 at
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
 at
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:401)
 at
 org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:281)
 at
 org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:579)
 at
 org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:1568)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown
 Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
 at java.lang.Thread.run(Unknown Source)


The Solr instance is running as a replication slave.
This is the solrconfig.xml: http://pastebin.com/GSv2wBB4
This is the schema.xml: http://pastebin.com/5VpRT5Jj

Any help? How can I find what is causing this exception? I thought that
dismax didn't throw exceptions...
-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: please help Problem with dataImportHandler

2011-01-26 Thread Ezequiel Calderara
And the answer there didn't help?
Why not copy the logs of this new error too?


Every time you encounter an error, take the time to send the log output and,
if needed, the schema.xml or solrconfig.xml.

Thanks


On Tue, Jan 25, 2011 at 6:44 AM, Dinesh mdineshkuma...@karunya.edu.inwrote:



 http://lucene.472066.n3.nabble.com/Getting-started-with-writing-parser-tp2278092p2327738.html

 this thread explains my problem

 -
 DINESHKUMAR . M
 I am neither especially clever nor especially gifted. I am only very, very
 curious.
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/please-help-Problem-with-dataImportHandler-tp2318585p2327745.html
  Sent from the Solr - User mailing list archive at Nabble.com.




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: please help Problem with dataImportHandler

2011-01-24 Thread Ezequiel Calderara
This may be a dumb question, but is the XML encoded in UTF-8?

On Mon, Jan 24, 2011 at 7:08 AM, Dinesh mdineshkuma...@karunya.edu.inwrote:


 this is the error that I'm getting... no idea what it is...


 /apache-solr-1.4.1/example/exampledocs# java -jar post.jar sample.txt
 SimplePostTool: version 1.2
 SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8,
 other encodings are not currently supported
 SimplePostTool: POSTing files to http://localhost:8983/solr/update..
 SimplePostTool: POSTing file sample.txt
 SimplePostTool: FATAL: Solr returned an error:

 Severe_errors_in_solr_configuration__Check_your_log_files_for_more_detailed_information_on_what_may_be_wrong__If_you_want_solr_to_continue_after_configuration_errors_changeabortOnConfigurationErrorfalseabortOnConfigurationError__in_null___orgapachesolrhandlerdataimportDataImportHandlerException_Exception_occurred_while_initializing_context__at_orgapachesolrhandlerdataimportDataImporterloadDataConfigDataImporterjava190__at_orgapachesolrhandlerdataimportDataImporterinitDataImporterjava101__at_orgapachesolrhandlerdataimportDataImportHandlerinformDataImportHandlerjava113__at_orgapachesolrcoreSolrResourceLoaderinformSolrResourceLoaderjava508__at_orgapachesolrcoreSolrCoreinitSolrCorejava588__at_orgapachesolrcoreCoreContainer$InitializerinitializeCoreContainerjava137__at_orgapachesolrservletSolrDispatchFilterinitSolrDispatchFilterjava83__at_orgmortbayjettyservletFilterHolderdoStartFilterHolderjava99__at_orgmortbaycomponentAbstractLifeCyclestartAbstractLifeCyclejava40__at_orgmortbayjettyservletServletHandlerinitializeServletHandlerjava594__at_orgmortbayjettyservletContextstartContextContextjava139__at_orgmortbayjettywebappWebAppContextstartContextWebAppContextjava1218__at_orgmortbayjettyhandlerContextHandlerdoStartContextHandlerjava500__at_orgmortbayjettywebappWebAppContextdoStartWebAppContextjava448__at_orgmortbaycomponentAbstractLifeCyclestartAbstractLifeCyclejava40__at_orgmortbayjettyhandlerHandlerCollectiondoStartHandlerCollectionjava147__at_orgmortbayjettyhandlerContextHandlerCollectiondoStartContextHandlerCollectionjava161__at_orgmortbaycomponentAbstractLifeCyclestartAbstractLifeCyclejava40__at_orgmortbayjettyhandlerHandlerCollectiondoStartHandlerCollectionjava147__at_orgmortbaycomponentAbstractLifeCyclestartAbstractLifeCyclejava40__at_orgmortbayjettyhan
 root@karunya-desktop:/home/karunya/apache-solr-1.4.1/example/exampledocs#


 -
 DINESHKUMAR . M
 I am neither especially clever nor especially gifted. I am only very, very
 curious.
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/please-help-Problem-with-dataImportHandler-tp2318585p2318585.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: please help Problem with dataImportHandler

2011-01-24 Thread Ezequiel Calderara
And what do the logs say about it?

On Mon, Jan 24, 2011 at 7:15 AM, Dinesh mdineshkuma...@karunya.edu.inwrote:


 actually it's a log file; I separately created a handler for that... it's not
 XML

 -
 DINESHKUMAR . M
 I am neither especially clever nor especially gifted. I am only very, very
 curious.
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/please-help-Problem-with-dataImportHandler-tp2318585p2318617.html
  Sent from the Solr - User mailing list archive at Nabble.com.




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: please help Problem with dataImportHandler

2011-01-24 Thread Ezequiel Calderara
I mean, when you run the DIH, what's the output of the Solr log? There is
probably more info there about what's happening...
On Mon, Jan 24, 2011 at 10:28 AM, Dinesh mdineshkuma...@karunya.edu.inwrote:


 it's a DHCP log... I want to index it

 -
 DINESHKUMAR . M
 I am neither especially clever nor especially gifted. I am only very, very
 curious.
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/please-help-Problem-with-dataImportHandler-tp2318585p2319627.html
  Sent from the Solr - User mailing list archive at Nabble.com.




-- 
__
Ezequiel.

Http://www.ironicnet.com


Backup and Recover strategy

2011-01-21 Thread Ezequiel Calderara
Hello, I just finished implementing a master with two slaves (this is a test
for now :P).

I'm trying to figure out how to do backups and restores without stopping the
service or using a passive slave.
Right now I'm backing up using <str name="backupAfter">optimize</str>,
and it creates a snapshot folder for the backup.
Is there any way to indicate where to back up, or some other options?

Is there any other way of doing backups without stopping the service?

Thanks all!
-- 
__
Ezequiel.

Http://www.ironicnet.com
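
To make the backup setup above concrete, this is roughly what the master-side config looks like; whether your Solr version supports choosing the backup destination (e.g. a `location` parameter on the backup command) should be checked against its ReplicationHandler docs, so treat that part as an assumption:

```xml
<!-- solrconfig.xml on the master: snapshot automatically after optimize -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="backupAfter">optimize</str>
  </lst>
</requestHandler>
```

A backup can also be triggered on demand, without stopping the service, by hitting `http://master:8983/solr/replication?command=backup`.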


Master and Slaves

2011-01-21 Thread Ezequiel Calderara
I have set up a master with two slaves. Let's call the master Jabba and the
slaves Leia and C3PO (very nerdy! lol).
Well, I have set up replication in Jabba with the following confFiles:
<str name="confFiles">solrconfig_slave.xml:solrconfig.xml,schema.xml,stopwords.txt,elevate.xml</str>

But in the slaves I want to override the dataDir value of the
solrconfig.xml, but it gets overwritten by the replicated one.
Is there a way to have the slaves' solrconfig replicated, but
with some slave-specific settings?

I want to avoid having to log in to each slave to configure it; I prefer to
do it in a centralized way.
-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-21 Thread Ezequiel Calderara
On Tue, Jan 18, 2011 at 6:04 PM, Grant Ingersoll gsing...@apache.orgwrote:

 Where do you get your Lucene/Solr downloads from?

 [X] ASF Mirrors (linked in our release announcements or via the Lucene
 website)

 [] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.)

 [] I/we build them from source via an SVN/Git checkout.

 [X] Other (someone in your company mirrors them internally or via a
 downstream project)




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Master and Slaves

2011-01-21 Thread Ezequiel Calderara
Thanks, that's what I needed!

There is always so much to learn about Solr/Lucene!


On Fri, Jan 21, 2011 at 10:08 AM, Markus Jelsma
markus.jel...@openindex.iowrote:

 solrcore.properties




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Master and Slaves

2011-01-21 Thread Ezequiel Calderara
Somehow it's not working :(
I have set it up like:

  #solrcore.properties

  data.dir=D:\Solr\PAU\data

 But it keeps using the dataDir configured in solrconfig.xml.
Also, when I go to the replication admin page I see this:
  Master: http://10.11.33.180:8787/solr/replication
  Poll Interval: 00:00:60
  Local Index: Index Version 1295466104884, Generation 4
  Location: C:\Program Files\Apache Software Foundation\Tomcat 7.0\data\index
  Size: 6,99 KB
  Times Replicated Since Startup: 50
  Previous Replication Done At: Fri Jan 21 11:36:19 ART 2011
  Config Files Replicated At: null
  Config Files Replicated: null
  Times Config Files Replicated Since Startup: null
  Next Replication Cycle At: Fri Jan 21 11:37:19 ART 2011

And I know that the files were replicated OK: I see the solrconfig backup
named solrconfig.xml.20110120030345, and the dataDir changed too...

So I don't understand why it isn't showing as replicated.
Maybe I'm doing something wrong. I don't know.
On Fri, Jan 21, 2011 at 10:16 AM, Ezequiel Calderara ezech...@gmail.comwrote:

 Thanks!, thats what i needed!

 There is always some much to learn about Solr/Lucene!


 On Fri, Jan 21, 2011 at 10:08 AM, Markus Jelsma 
 markus.jel...@openindex.io wrote:

 solrcore.properties




 --
 __
 Ezequiel.

 Http://www.ironicnet.com http://www.ironicnet.com/




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Master and Slaves

2011-01-21 Thread Ezequiel Calderara
Ohh, I see... I was setting a default value in the solrconfig_slave like
this:
<dataDir>${solr.data.dir:.\data}</dataDir>


I will try ${data.dir}.


2011/1/21 Tomás Fernández Löbbe tomasflo...@gmail.com

 Did you modify the solrconfig file with:

 <dataDir>${data.dir}</dataDir>

 ??


 On Fri, Jan 21, 2011 at 11:38 AM, Ezequiel Calderara ezech...@gmail.com
 wrote:

  Somehow it's not working :(
  i have set it up like:
 
#solrcore.properties
  
data.dir=D:\Solr\PAU\data
  
   But it keeps going to the dataDir configured in the solrconfig.xml.
  Also, when i go to the replication admin i see this:
   *Master* http://10.11.33.180:8787/solr/replication  *Poll Interval*
  00:00:60
   *Local Index* Index Version: 1295466104884, Generation: 4  Location:
  C:\Program Files\Apache Software Foundation\Tomcat 7.0\data\index  Size:
  6,99 KB  Times Replicated Since Startup: 50  Previous Replication Done
 At:
  Fri Jan 21 11:36:19 ART 2011  *Config Files Replicated At: null * **
  *Config
  Files Replicated: null * ** *Times Config Files Replicated Since Startup:
  null*  Next Replication Cycle At: Fri Jan 21 11:37:19 ART 2011
 
  And i know that the files were replicated ok. i see the solrconfig backup
  with name solrconfig.xml.20110120030345, and the datadir changed
 also...
 
  So i don't understand why isn't figuring as replicated.
  Maybe i'm doing something wrong. Don't know
  On Fri, Jan 21, 2011 at 10:16 AM, Ezequiel Calderara ezech...@gmail.com
  wrote:
 
   Thanks!, thats what i needed!
  
   There is always some much to learn about Solr/Lucene!
  
  
   On Fri, Jan 21, 2011 at 10:08 AM, Markus Jelsma 
   markus.jel...@openindex.io wrote:
  
   solrcore.properties
  
  
  
  
   --
   __
   Ezequiel.
  
   Http://www.ironicnet.com http://www.ironicnet.com/ 
 http://www.ironicnet.com/
   
 
 
 
  --
  __
  Ezequiel.
 
  Http://www.ironicnet.com http://www.ironicnet.com/
 




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Master and Slaves

2011-01-21 Thread Ezequiel Calderara
It worked! :)

On Fri, Jan 21, 2011 at 12:02 PM, Ezequiel Calderara ezech...@gmail.comwrote:

 Ohh i see... i was setting a default value in the solrconfig_slave like
 this:
 <dataDir>${solr.data.dir:.\data}</dataDir>


 i will try the ${data.dir}


 2011/1/21 Tomás Fernández Löbbe tomasflo...@gmail.com

 Did you modify the solrconfig file with:

 <dataDir>${data.dir}</dataDir>

 ??


 On Fri, Jan 21, 2011 at 11:38 AM, Ezequiel Calderara ezech...@gmail.com
 wrote:

  Somehow it's not working :(
  i have set it up like:
 
#solrcore.properties
  
data.dir=D:\Solr\PAU\data
  
   But it keeps going to the dataDir configured in the solrconfig.xml.
  Also, when i go to the replication admin i see this:
   *Master* http://10.11.33.180:8787/solr/replication  *Poll Interval*
  00:00:60
   *Local Index* Index Version: 1295466104884, Generation: 4  Location:
  C:\Program Files\Apache Software Foundation\Tomcat 7.0\data\index  Size:
  6,99 KB  Times Replicated Since Startup: 50  Previous Replication Done
 At:
  Fri Jan 21 11:36:19 ART 2011  *Config Files Replicated At: null * **
  *Config
  Files Replicated: null * ** *Times Config Files Replicated Since
 Startup:
  null*  Next Replication Cycle At: Fri Jan 21 11:37:19 ART 2011
 
  And i know that the files were replicated ok. i see the solrconfig
 backup
  with name solrconfig.xml.20110120030345, and the datadir changed
 also...
 
  So i don't understand why isn't figuring as replicated.
  Maybe i'm doing something wrong. Don't know
  On Fri, Jan 21, 2011 at 10:16 AM, Ezequiel Calderara 
 ezech...@gmail.com
  wrote:
 
   Thanks!, thats what i needed!
  
   There is always some much to learn about Solr/Lucene!
  
  
   On Fri, Jan 21, 2011 at 10:08 AM, Markus Jelsma 
   markus.jel...@openindex.io wrote:
  
   solrcore.properties
  
  
  
  
   --
   __
   Ezequiel.
  
   Http://www.ironicnet.com http://www.ironicnet.com/ 
 http://www.ironicnet.com/
   
 
 
 
  --
  __
  Ezequiel.
 
  Http://www.ironicnet.com http://www.ironicnet.com/
 




 --
  __
 Ezequiel.

 Http://www.ironicnet.com http://www.ironicnet.com/




-- 
__
Ezequiel.

Http://www.ironicnet.com
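
The working combination from this thread, sketched end to end (paths are illustrative): the replicated slave config references the property with no default, and each slave's local, non-replicated solrcore.properties supplies its own value.

```xml
<!-- solrconfig_slave.xml (pushed from the master to every slave) -->
<dataDir>${data.dir}</dataDir>
```

```properties
# solrcore.properties on each slave (not replicated, so it can differ per box)
data.dir=D:\Solr\PAU\data
```

The earlier attempt failed because the replicated file carried a default for a different property name (`${solr.data.dir:.\data}`) than the one solrcore.properties actually set (`data.dir`), so the default always won.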


Re: Can I host TWO separate datasets in Solr?

2011-01-21 Thread Ezequiel Calderara
You can configure them as two different instances in a Tomcat server, or keep
running two Jetty apps... :P

On Fri, Jan 21, 2011 at 8:51 PM, Igor Chudov ichu...@gmail.com wrote:

 I would like to have two sets of data and search them separately (they are
 used for two different websites).

 How can I do it?

 Thanks!




-- 
__
Ezequiel.

Http://www.ironicnet.com
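
Besides the two-instances and two-Jetty-apps options above, a single Solr instance can also host both datasets as separate cores via solr.xml (multicore, available since Solr 1.3); core names and directories here are illustrative:

```xml
<!-- solr.xml in the Solr home: two cores, queried independently at
     /solr/site1/select and /solr/site2/select -->
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="site1" instanceDir="site1"/>
    <core name="site2" instanceDir="site2"/>
  </cores>
</solr>
```

Each core gets its own conf/ (schema.xml, solrconfig.xml) and data/ directory under its instanceDir.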


Re: Query Problem

2010-12-17 Thread Ezequiel Calderara
Hi Erick, you were right.

I'm looking at the source of the search result (instead of the render in
Internet Explorer :$) and I see this:
<str name="SectionName">Programas_Home
</str>

So I think the problem is in the SSIS process that retrieves data
from the DB and sends it to Solr.
The data type in the DB is VARCHAR(100)... but I'm sure that somewhere it is
being mapped to CHAR(100), so its length is always 100.

Thank you very much, I will keep you informed.

Thanks



On Thu, Dec 16, 2010 at 9:38 PM, Erick Erickson erickerick...@gmail.comwrote:

 OK, it works perfectly for me on a 1.4.1 instance. I've looked over your
 files a couple of times and see nothing obvious (but you'll never find
 anyone better at overlooking the obvious than me!).

 Tokenizing and stemming are irrelevant in this case because your
 type is string, which is an untokenizedtype so you don't need to
 go there.

 The way your query parses and analyzes backs this up, so you're
 getting to the right schema definition.

 Which may bring us to whether what's in the index is what you *think* is
 in there. I'm betting not. Either you changed the schema and didn't
 re-index
 (say changed index=false to index=true), you didn't commit the
 documents
 after indexing or other such-like, or changed the field type and didn't
 reindex.

 So go into /solr/admin. Click on schema browser, click on fields.
 Along
 the left you should see SectionName, click on that. That will show you
 the
 #indexed# terms, and you should see, exactly, Programas_Home in there,
 just
 like in your returned documents. Let us know if that's in fact what you do
 see. It's
 possible you're being mislead by the difference between seeing the value in
 a returned
 document (the stored value) and what's searched on (the indexed token(s)).

 And I'm assuming that some asterisks in your mails were really there for
 bolding and
 you are NOT doing wildcard searches for, for instance,
  *SectionName:Programas_Home*.

 But we're at a point where my 1.4.1 instance produces the results you're
 expecting,
 at least as I understand them so I don't think it's a problem with Solr,
 but
 some change
 you've made is producing results you don't expect but are correct. Like I
 said,
 look at the indexed terms. If you see Programas_Home in the admin console
 after
 following the steps above, then I don't know what to suggest

 Best
 Erick

 On Thu, Dec 16, 2010 at 5:12 PM, Ezequiel Calderara ezech...@gmail.com
 wrote:

  The jars are named like *1.4.1* . So i suppose its the version 1.4.1
 
  Thanks!
 
  On Thu, Dec 16, 2010 at 6:54 PM, Erick Erickson erickerick...@gmail.com
  wrote:
 
   OK, what version of Solr are you using? I can take a quick check to see
   what behavior I get
  
   Erick
  
   On Thu, Dec 16, 2010 at 4:44 PM, Ezequiel Calderara 
 ezech...@gmail.com
   wrote:
  
I'll check the Tokenizer to see if that's the problem.
The results of Analysis Page for SectionName:Programas_Home
 Query Analyzer org.apache.solr.schema.FieldType$DefaultAnalyzer {}
   term
position 1 term text Programas_Home term type word source start,end
  0,14
payload
   
So it's not having problems with that... Also in the debug you can
 see
   that
the parsed query is correct...
So i don't know where to look...
   
I know nothing about Stemming or tokenizing, but i will look if
 that
   has
anything to do.
   
If anyone can help me out, please do :D
   
   
   
   
On Thu, Dec 16, 2010 at 5:55 PM, Erick Erickson 
  erickerick...@gmail.com
wrote:
   
 Ezequiel:

 Nice job of including relevant details, by the way. Unfortunately
 I'm
 puzzled too. Your SectionName is a string type, so it should
 be placed in the index as-is. Be a bit cautious about looking at
 returned results (as I see in one of your xml files) because the
   returned
 values are the verbatim, stored field NOT what's tokenized, and the
 tokenized data is what's searched..

 That said, you SectionName should not be tokenized at all because
 it's a string type. Take a look at the admin page, schema browser
  and
 see what values for SectionName look (these will be the tokenized
 values. They should be exactly
 Programas_Name, complete with underscore, case changes, etc. Is
 that
 the case?

 Another place that might help is the admin/analysis page. Check the
   debug
 boxes and input your steps and it'll show you what the
  transformations
 are applied. But a quick look leaves me completely baffled.

 Sorry I can't be more help
 Erick

 On Thu, Dec 16, 2010 at 2:07 PM, Ezequiel Calderara 
   ezech...@gmail.com
 wrote:

  Hi all, I have the following problems.
  I have this set of data (View data (Pastebin) 
  http://pastebin.com/jKbUhjVS
  )
  If i do a search for: *SectionName:Programas_Home* i have no
  results:
  Returned
  Data (PasteBin) http

Re: Query Problem

2010-12-17 Thread Ezequiel Calderara
Well... finally... it isn't a Solr problem, and it isn't a Solr config problem.
It's Microsoft's problem:
http://flyingtriangles.blogspot.com/2006/08/workaround-to-ssis-strings-are-not.html

Thank you very much, Erick!! You really helped with the solution to this!


On Fri, Dec 17, 2010 at 10:52 AM, Erick Erickson erickerick...@gmail.comwrote:

 Right, I *love* problems like this... NOT

 You might get some joy out of using TrimFilterFactory along with
 KeywordAnalyzer,
 something like this:
  <fieldType name="trimField" class="solr.TextField" [your options here]>
    <analyzer>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.TrimFilterFactory"/>
    </analyzer>
  </fieldType>

 but it depends upon what your fields are padded with

 Best
 Erick

 On Fri, Dec 17, 2010 at 8:12 AM, Ezequiel Calderara ezech...@gmail.com
 wrote:

  Hi Erick, you were right.
 
  I'm looking the source of the search result (instead of the render of
  internet explorer :$) and i see this:
  str name=SectionNameProgramas_Home
  /str
 
  So i think that is the problem is in the SSIS process that retrieves data
  from the DB and sends it to solr.
  The data type in the db is VARCHAR(100)... but i'm sure that somewhere is
  mapping it to CHAR(100) so it's length its always 100.
 
  Thank you very much, i will keep you informed
 
  Thanks
 
 
 
  On Thu, Dec 16, 2010 at 9:38 PM, Erick Erickson erickerick...@gmail.com
  wrote:
 
   OK, it works perfectly for me on a 1.4.1 instance. I've looked over
 your
   files a couple of times and see nothing obvious (but you'll never find
   anyone better at overlooking the obvious than me!).
  
   Tokenizing and stemming are irrelevant in this case because your
   type is string, which is an untokenizedtype so you don't need to
   go there.
  
   The way your query parses and analyzes backs this up, so you're
   getting to the right schema definition.
  
   Which may bring us to whether what's in the index is what you *think*
 is
   in there. I'm betting not. Either you changed the schema and didn't
   re-index
   (say changed index=false to index=true), you didn't commit the
   documents
   after indexing or other such-like, or changed the field type and didn't
   reindex.
  
   So go into /solr/admin. Click on schema browser, click on
 fields.
   Along
   the left you should see SectionName, click on that. That will show
 you
   the
   #indexed# terms, and you should see, exactly, Programas_Home in
 there,
   just
   like in your returned documents. Let us know if that's in fact what you
  do
   see. It's
   possible you're being mislead by the difference between seeing the
 value
  in
   a returned
   document (the stored value) and what's searched on (the indexed
  token(s)).
  
   And I'm assuming that some asterisks in your mails were really there
 for
   bolding and
   you are NOT doing wildcard searches for, for instance,
*SectionName:Programas_Home*.
  
   But we're at a point where my 1.4.1 instance produces the results
 you're
   expecting,
   at least as I understand them so I don't think it's a problem with
 Solr,
   but
   some change
   you've made is producing results you don't expect but are correct. Like
 I
   said,
   look at the indexed terms. If you see Programas_Home in the admin
  console
   after
   following the steps above, then I don't know what to suggest
  
   Best
   Erick
  
   On Thu, Dec 16, 2010 at 5:12 PM, Ezequiel Calderara 
 ezech...@gmail.com
   wrote:
  
The jars are named like *1.4.1* . So i suppose its the version 1.4.1
   
Thanks!
   
On Thu, Dec 16, 2010 at 6:54 PM, Erick Erickson 
  erickerick...@gmail.com
wrote:
   
 OK, what version of Solr are you using? I can take a quick check to
  see
 what behavior I get

 Erick

 On Thu, Dec 16, 2010 at 4:44 PM, Ezequiel Calderara 
   ezech...@gmail.com
 wrote:

  I'll check the Tokenizer to see if that's the problem.
  The results of Analysis Page for SectionName:Programas_Home
   Query Analyzer org.apache.solr.schema.FieldType$DefaultAnalyzer
 {}
 term
  position 1 term text Programas_Home term type word source
 start,end
0,14
  payload
 
  So it's not having problems with that... Also in the debug you
 can
   see
 that
  the parsed query is correct...
  So i don't know where to look...
 
  I know nothing about Stemming or tokenizing, but i will look if
   that
 has
  anything to do.
 
  If anyone can help me out, please do :D
 
 
 
 
  On Thu, Dec 16, 2010 at 5:55 PM, Erick Erickson 
erickerick...@gmail.com
  wrote:
 
   Ezequiel:
  
   Nice job of including relevant details, by the way.
 Unfortunately
   I'm
   puzzled too. Your SectionName is a string type, so it should
   be placed in the index as-is. Be a bit cautious about looking
 at
   returned results (as I see in one of your xml files) because
 the
 returned

Query Problem

2010-12-16 Thread Ezequiel Calderara
Hi all, I have the following problems.
I have this set of data (View data (Pastebin) http://pastebin.com/jKbUhjVS
)
If I do a search for *SectionName:Programas_Home* I have no results: Returned
Data (PasteBin) http://pastebin.com/wnPdHqBm
If I do a search for *Programas_Home* I have only 1 result: Result Returned
(Pastebin) http://pastebin.com/fMZkLvYK
If I do a search for SectionName:Programa* I have 1 result: Result Returned
(Pastebin) http://pastebin.com/kLLnVp4b

This is my *schema* http://pastebin.com/PQM8uap4 (Pastebin) and this is my
*solrconfig* (PasteBin)

I don't understand why searching for SectionName:Programas_Home isn't
returning any results at all...

Can someone shed some light on this?
-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Query Problem

2010-12-16 Thread Ezequiel Calderara
I'll check the tokenizer to see if that's the problem.
These are the results of the Analysis page for SectionName:Programas_Home:

Query Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer {}
term position: 1
term text: Programas_Home
term type: word
source start,end: 0,14
payload:

So it's not having problems with that... Also, in the debug output you can
see that the parsed query is correct...
So I don't know where to look...

I know nothing about stemming or tokenizing, but I will look into whether
that has anything to do with it.

If anyone can help me out, please do :D




On Thu, Dec 16, 2010 at 5:55 PM, Erick Erickson erickerick...@gmail.comwrote:

 Ezequiel:

 Nice job of including relevant details, by the way. Unfortunately I'm
 puzzled too. Your SectionName is a string type, so it should
 be placed in the index as-is. Be a bit cautious about looking at
 returned results (as I see in one of your xml files) because the returned
 values are the verbatim, stored field NOT what's tokenized, and the
 tokenized data is what's searched..

  That said, your SectionName should not be tokenized at all because
 it's a string type. Take a look at the admin page, schema browser and
 see what values for SectionName look (these will be the tokenized
 values. They should be exactly
 Programas_Name, complete with underscore, case changes, etc. Is that
 the case?

 Another place that might help is the admin/analysis page. Check the debug
 boxes and input your steps and it'll show you what the transformations
 are applied. But a quick look leaves me completely baffled.

 Sorry I can't be more help
 Erick

 On Thu, Dec 16, 2010 at 2:07 PM, Ezequiel Calderara ezech...@gmail.com
 wrote:

  Hi all, I have the following problems.
  I have this set of data (View data (Pastebin) 
  http://pastebin.com/jKbUhjVS
  )
  If i do a search for: *SectionName:Programas_Home* i have no results:
  Returned
  Data (PasteBin) http://pastebin.com/wnPdHqBm
  If i do a search for: *Programas_Home* i have only 1 result: Result
  Returned
  (Pastebin) http://pastebin.com/fMZkLvYK
  if i do a search for: SectionName:Programa* i have 1 result: Result
  Returned
  (Pastebin) http://pastebin.com/kLLnVp4b
 
  This is my *schema* http://pastebin.com/PQM8uap4 (Pastebin) and this
 is
  my
  *solrconfig* http://%3c/?xml version=1.0 encoding=UTF-8
 ?(PasteBin)
  
  I don't understand why when searching for SectionName:Programas_Home
  isn't
  returning any results at all...
 
  Can someone send some light on this?
  --
  __
  Ezequiel.
 
  Http://www.ironicnet.com http://www.ironicnet.com/
 




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Query Problem

2010-12-16 Thread Ezequiel Calderara
The jars are named like *1.4.1*, so I suppose it's version 1.4.1.

Thanks!

On Thu, Dec 16, 2010 at 6:54 PM, Erick Erickson erickerick...@gmail.comwrote:

 OK, what version of Solr are you using? I can take a quick check to see
 what behavior I get

 Erick

 On Thu, Dec 16, 2010 at 4:44 PM, Ezequiel Calderara ezech...@gmail.com
 wrote:

  I'll check the Tokenizer to see if that's the problem.
  The results of Analysis Page for SectionName:Programas_Home
   Query Analyzer org.apache.solr.schema.FieldType$DefaultAnalyzer {}  term
  position 1 term text Programas_Home term type word source start,end 0,14
  payload
 
  So it's not having problems with that... Also in the debug you can see
 that
  the parsed query is correct...
  So i don't know where to look...
 
  I know nothing about Stemming or tokenizing, but i will look if that
 has
  anything to do.
 
  If anyone can help me out, please do :D
 
 
 
 
  On Thu, Dec 16, 2010 at 5:55 PM, Erick Erickson erickerick...@gmail.com
  wrote:
 
   Ezequiel:
  
   Nice job of including relevant details, by the way. Unfortunately I'm
   puzzled too. Your SectionName is a string type, so it should
   be placed in the index as-is. Be a bit cautious about looking at
    returned results (as I see in one of your xml files) because the returned
    values are the verbatim, stored field, NOT what's tokenized, and the
    tokenized data is what's searched.

    That said, your SectionName should not be tokenized at all because
    it's a string type. Take a look at the admin page, schema browser, and
    see what the values for SectionName look like (these will be the
    tokenized values). They should be exactly
    Programas_Home, complete with underscore, case changes, etc. Is that
    the case?
  
   Another place that might help is the admin/analysis page. Check the
 debug
   boxes and input your steps and it'll show you what the transformations
   are applied. But a quick look leaves me completely baffled.
  
   Sorry I can't be more help
   Erick
  
   On Thu, Dec 16, 2010 at 2:07 PM, Ezequiel Calderara 
 ezech...@gmail.com
   wrote:
  
Hi all, I have the following problems.
I have this set of data (View data (Pastebin) http://pastebin.com/jKbUhjVS)
If I do a search for *SectionName:Programas_Home* I have no results:
Returned Data (PasteBin) http://pastebin.com/wnPdHqBm
If I do a search for *Programas_Home* I have only 1 result: Result
Returned (Pastebin) http://pastebin.com/fMZkLvYK
If I do a search for SectionName:Programa* I have 1 result: Result
Returned (Pastebin) http://pastebin.com/kLLnVp4b
   
This is my *schema* http://pastebin.com/PQM8uap4 (Pastebin) and this is
my *solrconfig* (PasteBin)

I don't understand why searching for SectionName:Programas_Home
isn't returning any results at all...

Can someone shed some light on this?
--
__
Ezequiel.
   
Http://www.ironicnet.com http://www.ironicnet.com/ 
 http://www.ironicnet.com/

  
 
 
 
  --
  __
  Ezequiel.
 
  Http://www.ironicnet.com http://www.ironicnet.com/
 




-- 
__
Ezequiel.

Http://www.ironicnet.com
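For reference on Erick's point that a string-type field is indexed verbatim: a minimal schema.xml declaration of such a field might look like the sketch below. The field name comes from the thread; everything else is an assumed example.

```xml
<!-- "string" maps to solr.StrField: no analysis, the value is indexed as-is -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>

<!-- Because there is no tokenization, the query SectionName:Programas_Home
     must match the indexed value exactly, including case and underscore -->
<field name="SectionName" type="string" indexed="true" stored="true"/>
```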


Re: Best practice for emailing this list?

2010-11-10 Thread Ezequiel Calderara
Mmmm maybe it's your mail address? :P
Weird, I didn't have any problem with it using Gmail...

Send in plain text, avoid links... maybe that could work...

If you want, send me the mail and i will forward it to the list, just to
test!

On Wed, Nov 10, 2010 at 3:59 PM, robo - robom...@gmail.com wrote:

 No matter how much I limit my other email it will not get through the
 Solr mailing spam filter.  This has to be the most frustrating mailing
 list I have ever tried to work with.  All I need are some answers on
 replication and load balancing but I can't even get it to the list.


 On Wed, Nov 10, 2010 at 10:17 AM, Ken Stanley doh...@gmail.com wrote:
  On Wed, Nov 10, 2010 at 1:11 PM, robo - robom...@gmail.com wrote:
  How do people email this list without getting spam filter problems?
 
 
  Depends on which side of the spam filter that you're referring to.
  I've found that to keep these emails from entering my spam filter is
  to add a rule to Gmail that says Never send to spam. As for when I
  send emails, I make sure that I send my emails as plain text to avoid
  getting bounce backs.
 
  - Ken
 




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Best practice for emailing this list?

2010-11-10 Thread Ezequiel Calderara
Tried to forward the mail of robomon but had the same error:
Delivery to the following recipient failed permanently:
solr-user@lucene.apache.org
Technical details of permanent failure:
Google tried to deliver your message, but it was rejected by the recipient
domain. We recommend contacting the other email provider for further
information about the cause of this error. The error that the other server
returned was: 552 552 spam score (5.8) exceeded threshold (state 18).
- Original message -



On Wed, Nov 10, 2010 at 4:12 PM, Ezequiel Calderara ezech...@gmail.com wrote:

 Mmmm maybe its your mail address? :P
 Weird, i didn't have any problem with it using gmail...

 Send in plain text, avoid links or links... maybe that could work...

 If you want, send me the mail and i will forward it to the list, just to
 test!

   On Wed, Nov 10, 2010 at 3:59 PM, robo - robom...@gmail.com wrote:

 No matter how much I limit my other email it will not get through the
 Solr mailing spam filter.  This has to be the most frustrating mailing
 list I have ever tried to work with.  All I need are some answers on
 replication and load balancing but I can't even get it to the list.


 On Wed, Nov 10, 2010 at 10:17 AM, Ken Stanley doh...@gmail.com wrote:
  On Wed, Nov 10, 2010 at 1:11 PM, robo - robom...@gmail.com wrote:
  How do people email this list without getting spam filter problems?
 
 
  Depends on which side of the spam filter that you're referring to.
  I've found that to keep these emails from entering my spam filter is
  to add a rule to Gmail that says Never send to spam. As for when I
  send emails, I make sure that I send my emails as plain text to avoid
  getting bounce backs.
 
  - Ken
 




 --
 __
 Ezequiel.

 Http://www.ironicnet.com http://www.ironicnet.com/




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Updating Solr index - DIH delta vs. task queues

2010-11-04 Thread Ezequiel Calderara
I'm in the same scenario, so this answer would be helpful too..
I'm adding...

3) Web Service - request a web service for all the new data that has been
updated (can this be done?)
On Thu, Nov 4, 2010 at 2:38 PM, Andy angelf...@yahoo.com wrote:

 Hi,
 I have data stored in a database that is being updated constantly. I need
 to find a way to update Solr index as data in the database is being updated.
 There seems to be 2 main schools of thoughts on this:
 1) DIH delta - query the database for all records that have a timestamp
 later than the last_index_time. Import those records for indexing to Solr
 2) Task queue - every time a record is updated in the database, throw a
 task to a queue to index that record to Solr
 Just want to know what are the pros and cons of each approach and what is
 your experience. For someone starting new, what'd be your recommendation?
 Thanks, Andy







-- 
__
Ezequiel.

Http://www.ironicnet.com
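For option (1) above, a DIH delta import is driven by a deltaQuery in data-config.xml that selects the primary keys modified since `${dataimporter.last_index_time}`. A minimal sketch follows; the table and column names are assumptions:

```xml
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/db"/>
  <document>
    <!-- query: full import; deltaQuery: ids changed since the last run;
         deltaImportQuery: fetches each changed row by id -->
    <entity name="item" pk="id"
            query="SELECT id, name FROM item"
            deltaQuery="SELECT id FROM item
                        WHERE last_modified &gt; '${dataimporter.last_index_time}'"
            deltaImportQuery="SELECT id, name FROM item
                              WHERE id = '${dataimporter.delta.id}'"/>
  </document>
</dataConfig>
```

The delta run is then triggered with `/dataimport?command=delta-import`, while approach (2) bypasses DIH entirely and posts individual updates from the queue consumer.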


Re: Custom Sorting in Solr

2010-11-01 Thread Ezequiel Calderara
Ok, I imagined that the doubly linked list would be far too complicated for
solr.

Now, how can I make Solr connect to a webservice and do the import?

I'm sorry if I'm not clear, sometimes my english gets fuzzy :P

On Fri, Oct 29, 2010 at 4:51 PM, Yonik Seeley yo...@lucidimagination.com wrote:

 On Fri, Oct 29, 2010 at 3:39 PM, Ezequiel Calderara ezech...@gmail.com
 wrote:
  Hi all guys!
  I'm in a weird situation here.
  We have indexed a set of documents which are ordered using a linked list
  (each document has a reference to the previous and the next).
 
  Is there a way when sorting in the solr search, Use the linked list to
 sort?

 It seems like you should be able to encode this linked list as an
 integer instead, and sort by that?
 If there are multiple linked lists in the index, it seems like you
 could even use the high bits of the int to designate which list the
 doc belongs to, and the low order bits as the order in that list.

 -Yonik
 http://www.lucidimagination.com




-- 
__
Ezequiel.

Http://www.ironicnet.com
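Yonik's suggestion above can be sketched as a plain-Java toy (the class and method names are assumptions, not Solr API): at index time, walk each linked list once and give every document an int sort key whose high bits identify the list and whose low bits give the position, so an ordinary sort on that field reproduces the linked-list order.

```java
import java.util.*;

public class LinkedListSortKey {
    // Pack list id (high 16 bits) and position within the list (low 16 bits)
    // into one int; sorting by this int reproduces the list order.
    static int sortKey(int listId, int position) {
        return (listId << 16) | (position & 0xFFFF);
    }

    public static void main(String[] args) {
        // Two lists, given as doc ids already in linked-list order.
        String[][] lists = { {"docC", "docA"}, {"docB", "docD"} };
        Map<String, Integer> keyByDoc = new HashMap<>();
        for (int listId = 0; listId < lists.length; listId++)
            for (int pos = 0; pos < lists[listId].length; pos++)
                keyByDoc.put(lists[listId][pos], sortKey(listId, pos));

        // Sorting all docs by the key groups each list and keeps its order.
        List<String> docs = new ArrayList<>(keyByDoc.keySet());
        docs.sort(Comparator.comparingInt(keyByDoc::get));
        System.out.println(docs); // [docC, docA, docB, docD]
    }
}
```

The key would be stored in a sortable int field at index time; the 16/16 bit split is arbitrary and limits this sketch to 65536 lists of 65536 documents each.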


Custom Sorting in Solr

2010-10-29 Thread Ezequiel Calderara
Hi all guys!
I'm in a weird situation here.
We have indexed a set of documents which are ordered using a linked list (each
document has a reference to the previous and the next).

Is there a way when sorting in the solr search, Use the linked list to sort?


If that is not possible, how can I use the DIH to access a WCF service or
a web service? Should I develop my own DIH?


-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: How to delete a SOLR document if that particular data doesnt exist in DB?

2010-10-20 Thread Ezequiel Calderara
Can't you, on each delete of that data, save the ids in another table?
And then process those ids against Solr to delete them?
On Wed, Oct 20, 2010 at 11:51 AM, bbarani bbar...@gmail.com wrote:


 Hi,

  I have a very common question but couldn't find any post related to my
  question in this forum.

  I am currently initiating a full import each week, but the data that has
  been deleted in the source is not updated in my index, as I am using
  clean=false.

  We are indexing multiple data by data types, hence we can't delete the index
  and do a complete re-indexing each week; we also want to delete the orphan
  Solr documents (for which the data is not present in the back-end DB) on a
  daily basis.

  Now my question is: is there a way I can use preImportDeleteQuery to delete
  the documents from Solr for which the data doesn't exist in the back-end DB?
  I don't have anything like a delete status in the DB; instead I need to get
  all the UIDs from the Solr index, compare them with all the UIDs in the back
  end, and delete from Solr the documents whose UIDs are not present in the
  DB.

 Any suggestion / ideas would be of great help.

  Note: Currently I have developed a simple program which will fetch the UIDs
  from the Solr index, then connect to the back-end DB to check for orphan
  UIDs, and delete the documents from the Solr index corresponding to those
  orphan UIDs. I just don't want to re-invent the wheel if this feature is
  already present in Solr, as I need to do more testing in terms of
  performance / scalability for my program.

 Thanks,
 Barani


 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/How-to-delete-a-SOLR-document-if-that-particular-data-doesnt-exist-in-DB-tp1739222p1739222.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
__
Ezequiel.

Http://www.ironicnet.com
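The orphan cleanup Barani describes boils down to a set difference between the UIDs in Solr and the UIDs in the DB. A toy sketch, with in-memory sets standing in for the two sources (the class and method names are hypothetical):

```java
import java.util.*;

public class OrphanFinder {
    // UIDs present in the Solr index but absent from the DB are orphans
    // and should be deleted from Solr.
    static Set<String> orphans(Set<String> solrUids, Set<String> dbUids) {
        Set<String> result = new TreeSet<>(solrUids);
        result.removeAll(dbUids);
        return result;
    }

    public static void main(String[] args) {
        Set<String> solrUids = new HashSet<>(Arrays.asList("u1", "u2", "u3"));
        Set<String> dbUids   = new HashSet<>(Arrays.asList("u1", "u3"));
        // "u2" was deleted in the DB, so it must be purged from Solr,
        // e.g. with one delete-by-id request per orphan UID.
        System.out.println(orphans(solrUids, dbUids)); // [u2]
    }
}
```

For large indexes, both UID sets would be streamed in sorted order rather than held in memory, but the comparison logic stays the same.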


Re: How to delete a SOLR document if that particular data doesnt exist in DB?

2010-10-20 Thread Ezequiel Calderara
Also, maybe you can set an expiration policy, and delete documents that have
expired after some time... but I don't know if you can iterate over the
existing ids...

On Wed, Oct 20, 2010 at 1:34 PM, Shawn Heisey s...@elyograg.org wrote:

 On 10/20/2010 9:59 AM, bbarani wrote:

 We actually use virtual DB modelling tool to fetch the data from various
 sources during run time hence we dont have any control over the source.

 We consolidate the data from more than one source and index the
 consolidated
 data using SOLR. We dont have any kind of update / access rights to source
 data.


 It seems likely that those who are in control of the data sources would be
 maintaining some kind of delete log, and that they should be able to make
 those logs available to you.

 For my index, the data comes from a MySQL database.  When a delete is done
 at the database level, a database trigger records the old information to a
 main delete log table, as well as a separate table for the search system.
  The build system uses that separate table to run deletes every ten minutes
 and keeps it trimmed to 48 hours of delete history.





-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: query results file for trec_eval

2010-10-19 Thread Ezequiel Calderara
I don't know anything about the TREC document format, but I think if you
want text output, you can do it by using the
http://wiki.apache.org/solr/XsltResponseWriter to transform the XML to
text...

On Tue, Oct 19, 2010 at 12:29 PM, Valli Indraganti 
valli.indraga...@gmail.com wrote:

 Hello!

 I am a student and I am trying to run evaluation for TREC format document.
 I
 have the judgments. I would like to have the output of my queries for use
 with trec_eval software. Can someone please point me how to make Solr spit
 out output in this format? Or at least point me to some material that
 guides
 me through this.

 Thanks,
 Valli




-- 
__
Ezequiel.

Http://www.ironicnet.com
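A sketch of what such an XSLT could look like for the standard six-column TREC run format (`qid Q0 docid rank score runid`). The field names, query id, and run id are assumptions and would need to match the actual schema and evaluation setup:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Invoke with wt=xslt&tr=trec.xsl and request fl=id,score;
     emits one "qid Q0 docid rank score runid" line per result -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="response/result/doc">
      <xsl:text>1 Q0 </xsl:text>
      <xsl:value-of select="str[@name='id']"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="position()"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="float[@name='score']"/>
      <xsl:text> myrun&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The query id is hard-coded to 1 here; a real run script would substitute the topic number per query and concatenate the outputs into the run file fed to trec_eval.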


Commits on service after shutdown

2010-10-18 Thread Ezequiel Calderara
 Hi, I'm new to the mailing list.
I'm implementing Solr at my current job, and I'm having some problems.
I was testing the consistency of commits. I found, for example, that if
we add X documents to the index (without committing) and then we restart the
service, the documents are committed: they show up in the results. This
looks like an error to me.
But when we add X documents to the index (without committing) and then we
kill the process and start it again, the documents don't appear. This
behaviour is the one I want.

Is there any param to avoid the auto-committing of documents after a
shutdown?
Is there any param to keep those un-committed documents alive after a kill?

Thanks!

-- 
__
Ezequiel.

Http://www.ironicnet.com http://www.ironicnet.com/


Re: Commits on service after shutdown

2010-10-18 Thread Ezequiel Calderara
I understand, but I want to have control over what is committed or not.
In our scenario, we want to add documents to the index, and maybe an hour
later trigger the commit.

If in the middle we have a server shutdown, or any process sends a
shutdown signal to the process, I don't want those documents being committed.

Should I file a bug issue or an enhancement issue?

Thanks


On Mon, Oct 18, 2010 at 3:54 PM, Israel Ekpo israele...@gmail.com wrote:

 The documents should be implicitly committed when the Lucene index is
 closed.

 When you perform a graceful shutdown, the Lucene index gets closed and the
 documents get committed implicitly.

 When the shutdown is abrupt, as in a KILL -9, then this does not happen and
 the updates are lost.

 You can use the auto commit parameter when sending your updates so that the
 changes are saved right away, though this could slow down the indexing
 speed considerably, but I do not believe there are parameters to keep those
 un-committed documents alive after a kill.



 On Mon, Oct 18, 2010 at 2:46 PM, Ezequiel Calderara ezech...@gmail.com
 wrote:

   Hi, i'm new in the mailing list.
  I'm implementing Solr in my actual job, and i'm having some problems.
  I was testing the consistency of the commits. I found for example that
 if
  we add X documents to the index (without commiting) and then we restart
 the
  service, the documents are commited. They show up in the results. This is
  interpreted to me like an error.
  But when we add X documents to the index (without commiting) and then we
  kill the process and we start it again, the documents doesn't appear.
 This
  behaviour is the one i want.
 
  Is there any param to avoid the auto-committing of documents after a
  shutdown?
  Is there any param to keep those un-commited documents alive after a
  kill?
 
  Thanks!
 
  --
  __
  Ezequiel.
 
  Http://www.ironicnet.com http://www.ironicnet.com/ 
 http://www.ironicnet.com/
 



 --
 °O°
 Good Enough is not good enough.
 To give anything less than your best is to sacrifice the gift.
 Quality First. Measure Twice. Cut Once.
 http://www.israelekpo.com/




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Commits on service after shutdown

2010-10-18 Thread Ezequiel Calderara
But if something happens within that hour, I will have lost the documents,
or committed them to the index outside the schedule.

How can I handle this scenario?

I think that Solr (or Lucene) should make sure of the durability
(http://en.wikipedia.org/wiki/Durability_(database_systems)) of
the data even if it's in an uncommitted state.
On Mon, Oct 18, 2010 at 4:53 PM, Matthew Hall mh...@informatics.jax.org wrote:

  No.. you would just turn autocommit off, and have the thread that is doing
 updates to your indexes commit every hour.   I'd think that this would take
 care of the scenario that you are describing.

 Matt


 On 10/18/2010 3:50 PM, Ezequiel Calderara wrote:

 I understand, but i want to have control of what is commit or not.
 In our scenario, we want to add documents to the index, and maybe after an
 hour trigger the commit.

 If in the middle, we have a server shutdown or any process sending a
 Shutdown signal to the process. I don't want those documents being
 commited.

 Should i file a bug issue or an enhacement issue?.

 Thanks


 On Mon, Oct 18, 2010 at 3:54 PM, Israel Ekpoisraele...@gmail.com
  wrote:

 The documents should be implicitly committed when the Lucene index is
 closed.

 When you perform a graceful shutdown, the Lucene index gets closed and
 the
 documents get committed implicitly.

 When the shutdown is abrupt as in a KILL -9, then this does not happen
 and
 the updates are lost.

 You can use the auto commit parameter when sending your updates so that
 the
 changes are saved right away, thought this could slow down the indexing
 speed considerably but I do not believe there are parameters to keep
 those
 un-commited documents alive after a kill.



 On Mon, Oct 18, 2010 at 2:46 PM, Ezequiel Calderaraezech...@gmail.com

 wrote:
  Hi, i'm new in the mailing list.
 I'm implementing Solr in my actual job, and i'm having some problems.
 I was testing the consistency of the commits. I found for example that

 if

 we add X documents to the index (without commiting) and then we restart

 the

 service, the documents are commited. They show up in the results. This
 is
 interpreted to me like an error.
 But when we add X documents to the index (without commiting) and then we
 kill the process and we start it again, the documents doesn't appear.

 This

 behaviour is the one i want.

 Is there any param to avoid the auto-committing of documents after a
 shutdown?
 Is there any param to keep those un-commited documents alive after a
 kill?

 Thanks!

 --
 __
 Ezequiel.

 Http://www.ironicnet.com http://www.ironicnet.com/
 http://www.ironicnet.com/  

 http://www.ironicnet.com/


 --
 °O°
 Good Enough is not good enough.
 To give anything less than your best is to sacrifice the gift.
 Quality First. Measure Twice. Cut Once.
 http://www.israelekpo.com/







-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Commits on service after shutdown

2010-10-18 Thread Ezequiel Calderara
I'll see if I can resolve this by adding an extra core with the same schema
to hold these documents.
So Core0 will act as a queue and Core1 will be the real index, and a
commit on Core0 will trigger an add to Core1 and its commit.
That way I can be sure of not losing data.

It surprises me that Solr doesn't have this feature built-in. I still have
to verify the performance, but it looks good to me.

Anyway, any help would be appreciated.


On Mon, Oct 18, 2010 at 5:05 PM, Ezequiel Calderara ezech...@gmail.com wrote:

 But if something happens in between that hour, i will have lost or
 committed the documents to the index out of the schedule.

 How can i handle this scenario?

  I think that Solr (or Lucene) should make sure of the durability
  (http://en.wikipedia.org/wiki/Durability_(database_systems)) of the
  data even if it's in an uncommitted state.
   On Mon, Oct 18, 2010 at 4:53 PM, Matthew Hall mh...@informatics.jax.org
  wrote:

  No.. you would just turn autocommit off, and have the thread that is
 doing updates to your indexes commit every hour.   I'd think that this would
 take care of the scenario that you are describing.

 Matt


 On 10/18/2010 3:50 PM, Ezequiel Calderara wrote:

 I understand, but i want to have control of what is commit or not.
 In our scenario, we want to add documents to the index, and maybe after
 an
 hour trigger the commit.

 If in the middle, we have a server shutdown or any process sending a
 Shutdown signal to the process. I don't want those documents being
 commited.

 Should i file a bug issue or an enhacement issue?.

 Thanks


 On Mon, Oct 18, 2010 at 3:54 PM, Israel Ekpoisraele...@gmail.com
  wrote:

 The documents should be implicitly committed when the Lucene index is
 closed.

 When you perform a graceful shutdown, the Lucene index gets closed and
 the
 documents get committed implicitly.

 When the shutdown is abrupt as in a KILL -9, then this does not happen
 and
 the updates are lost.

 You can use the auto commit parameter when sending your updates so that
 the
 changes are saved right away, thought this could slow down the indexing
 speed considerably but I do not believe there are parameters to keep
 those
 un-commited documents alive after a kill.



 On Mon, Oct 18, 2010 at 2:46 PM, Ezequiel Calderaraezech...@gmail.com

 wrote:
  Hi, i'm new in the mailing list.
 I'm implementing Solr in my actual job, and i'm having some problems.
 I was testing the consistency of the commits. I found for example
 that

 if

 we add X documents to the index (without commiting) and then we restart

 the

 service, the documents are commited. They show up in the results. This
 is
 interpreted to me like an error.
 But when we add X documents to the index (without commiting) and then
 we
 kill the process and we start it again, the documents doesn't appear.

 This

 behaviour is the one i want.

 Is there any param to avoid the auto-committing of documents after a
 shutdown?
 Is there any param to keep those un-commited documents alive after a
 kill?

 Thanks!

 --
 __
 Ezequiel.

 Http://www.ironicnet.com http://www.ironicnet.com/
 http://www.ironicnet.com/  

 http://www.ironicnet.com/


 --
 °O°
 Good Enough is not good enough.
 To give anything less than your best is to sacrifice the gift.
 Quality First. Measure Twice. Cut Once.
 http://www.israelekpo.com/







 --
  __
 Ezequiel.

 Http://www.ironicnet.com http://www.ironicnet.com/




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Spell checking question from a Solr novice

2010-10-18 Thread Ezequiel Calderara
You can cross the new words against a dictionary and keep them in the file
as Jason described...

What Pradeep said is true: it is always better to have suggestions related to
your index than suggestions that return no results...


On Mon, Oct 18, 2010 at 6:24 PM, Jason Blackerby jblacke...@gmail.com wrote:

 If you know the misspellings you could prevent them from being added to the
 dictionary with a StopFilterFactory like so:

    <fieldType name="textSpell" class="solr.TextField"
               positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="misspelled_words.txt"/>
        <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z])"
                replacement="" replace="all"/>
        <filter class="solr.LengthFilterFactory" min="2" max="50"/>
      </analyzer>
    </fieldType>

 where misspelled_words.txt contains the misspellings.

 On Mon, Oct 18, 2010 at 5:14 PM, Pradeep Singh pksing...@gmail.com
 wrote:

  I think a spellchecker based on your index has clear advantages. You can
  spellcheck words specific to your domain which may not be available in an
  outside dictionary. You can always dump the list from wordnet to get a
  starter english dictionary.
 
  But then it also means that misspelled words from your domain become the
  suggested correct word. Hmmm ... you'll need to have a way to prune out
  such
  words. Even then, your own domain based dictionary is a total go.
 
  On Mon, Oct 18, 2010 at 1:55 PM, Jonathan Rochkind rochk...@jhu.edu
  wrote:
 
   In general, the benefit of the built-in Solr spellcheck is that it can
  use
   a dictionary based on your actual index.
  
   If you want to use some external API, you certainly can, in your actual
   client app -- but it doesn't really need to involve Solr at all
 anymore,
   does it?  Is there any benefit I'm not thinking of to doing that on the
  solr
   side, instead of just in your client app?
  
   I think Yahoo (and maybe Microsoft?) have similar APIs with more
 generous
   ToSs, but I haven't looked in a while.
  
  
   Xin Li wrote:
  
   Oops, never mind. Just read Google API policy. 1000 queries per day
  limit
for non-commercial use only.
  
  
   -Original Message-
   From: Xin Li Sent: Monday, October 18, 2010 3:43 PM
   To: solr-user@lucene.apache.org
   Subject: Spell checking question from a Solr novice
  
   Hi,
   I am looking for a quick solution to improve a search engine's spell
   checking performance. I was wondering if anyone tried to integrate
  Google
   SpellCheck API with Solr search engine (if possible). Google
 spellcheck
  came
   to my mind because of two reasons. First, it is costly to clean up the
  data
   to be used as spell check baseline. Secondly, google probably has the
  most
   complete set of misspelled search terms. That's why I would like to
 know
  if
   it is a feasible way to go.
  
   Thanks,
   Xin
   This electronic mail message contains information that (a) is or may
 be
   CONFIDENTIAL, PROPRIETARY IN NATURE, OR OTHERWISE PROTECTED BY LAW
 FROM
   DISCLOSURE, and (b) is intended only for the use of the
   addressee(s) named herein.  If you are not an intended recipient,
 please
   contact the sender immediately and take the steps necessary to delete
  the
   message completely from your computer system.
  
   Not Intended as a Substitute for a Writing: Notwithstanding the
 Uniform
   Electronic Transaction Act or any other law of similar effect, absent
 an
   express statement to the contrary, this e-mail message, its contents,
  and
   any attachments hereto are not intended to represent an offer or
  acceptance
   to enter into a contract and are not otherwise intended to bind this
  sender,
   barnesandnoble.com llc, barnesandnoble.com inc. or any other person
 or
   entity.
  
  
  
 




-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: I need to indexing the first character of a field in another field

2010-10-18 Thread Ezequiel Calderara
How are you declaring the transformer in the dataconfig?

On Mon, Oct 18, 2010 at 6:31 PM, Renato Wesenauer 
renato.wesena...@gmail.com wrote:

 Hello guys,

  I need to index the first character of the field "autor" in another field,
  "inicialautor".
  Example:
    autor = Mark Webber
    inicialautor = M

  I did a javascript function in the dataimport, but the field "inicialautor"
  is indexed empty.

 The function:

    function InicialAutor(linha) {
        var aut = linha.get("autor");
        if (aut != null) {
            if (aut.length > 0) {
                var ch = aut.charAt(0);
                linha.put("inicialautor", ch);
            } else {
                linha.put("inicialautor", '');
            }
        } else {
            linha.put("inicialautor", '');
        }
        return linha;
    }

 What's wrong?

 Thanks,

 Renato Wesenauer




-- 
__
Ezequiel.

Http://www.ironicnet.com
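For reference, and to answer the question above about declaring the transformer: the usual way to wire a JavaScript function into DIH is a script block in data-config.xml plus transformer="script:FunctionName" on the entity. A sketch follows; the entity name and SQL query are assumptions:

```xml
<dataConfig>
  <script><![CDATA[
    function InicialAutor(linha) {
        var aut = linha.get("autor");
        // First character of "autor", or empty if the field is missing/empty
        linha.put("inicialautor",
                  (aut != null && aut.length > 0) ? aut.charAt(0) : '');
        return linha;
    }
  ]]></script>
  <document>
    <!-- the entity must reference the function as script:FunctionName -->
    <entity name="livro" transformer="script:InicialAutor"
            query="SELECT autor FROM livros"/>
  </document>
</dataConfig>
```

If the transformer attribute is missing or misnamed, DIH silently skips the function, which would produce exactly the empty-field symptom described above.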


Re: Admin for spellchecker?

2010-10-18 Thread Ezequiel Calderara
I was thinking about that; you would also need a way to mark a word as valid,
so it doesn't get flagged as wrong.


On Mon, Oct 18, 2010 at 6:37 PM, Pradeep Singh pksing...@gmail.com wrote:

 Do we need an admin screen for spellchecker? Where you can browse the words
 and delete the ones you don't like so that they don't get suggested?




-- 
__
Ezequiel.

Http://www.ironicnet.com