Post PDF to solr with asp.net

2011-01-27 Thread Andrew McCombe
Hi

We are trying to post some PDF documents to Solr for indexing using ASP.NET
but cannot find any documentation or a library that will allow posting of
binary data.

Has anyone done this and if so, how?

Regards
Andrew McCombe
iWeb Solutions Ltd.


Re: Problem using curl in PHP to get Solr results

2010-12-15 Thread Andrew McCombe
Hi

You could use Solr's PHP serialized object output (wt=phps) and then convert
it to JSON in your PHP:
<?php
echo json_encode(unserialize($results_from_solr));
?>

Regards
Andrew McCombe

On 15 December 2010 17:49, Dennis Gearon gear...@sbcglobal.net wrote:

 I want to just pass the JSON through after qualifying the user's access to
 the site.

 Didn't want to spend the horsepower to receive it as PHP array syntax, run
 the risk of someone putting bad stuff in the contents and running 'exec()'
 on it, and then spending the extra horsepower to output it as JSON.

 I had that page up in the browser to look at it later. If it doesn't do
 the above, I will be glad to have the Solr access abstracted, thanks :-)


  Dennis Gearon


 Signature Warning
 
 It is always a good idea to learn from your own mistakes. It is usually a
 better
 idea to learn from others’ mistakes, so you do not have to make them
 yourself.
 from 'http://blogs.techrepublic.com.com/security/?p=4501tag=nl.e036'


 EARTH has a Right To Life,
 otherwise we all die.



 - Original Message 
 From: Stephen Weiss swe...@stylesight.com
 To: solr-user@lucene.apache.org
 Sent: Wed, December 15, 2010 1:36:11 AM
 Subject: Re: Problem using curl in PHP to get Solr results

 Forgive me if this seems like a dumb question but have you tried the
 Apache_Solr_Service class?

 http://www.ibm.com/developerworks/library/os-php-apachesolr/index.html

 It's really quite good at handling the nuts and bolts of making the HTTP
 requests and decoding the responses for PHP.  I almost always use it when
 working from PHP.  It's all over Google so I don't know how someone would
 miss it but I don't know why else someone would bother curling a GET to
 SOLR otherwise.

 --
 Steve

 On Dec 15, 2010, at 4:22 AM, Dennis Gearon wrote:

  I finally figured out how to use curl to GET results, i.e. just turn all
  spaces into '%20' in my type of queries. I'm using Solr spatial, and then
  searching in both the default text field and a couple of columns. Works
  fine in the browser.

  But if I query for it using curl in PHP, there's an error somewhere in the
  JSON.

  I don't know if it's in the PHP food chain or something else.

  Just putting my solution to GETing from curl in PHP and my problem up
  here, for others to find.

  Of course, if anyone knows the answer, all the better.
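
The space-to-'%20' step described above is ordinary percent-encoding of the query string. As a rough sketch in Python (not the PHP/curl code from this thread), the standard library can encode the whole parameter set in one go. The host, port, handler path and the example query below are assumptions made up for illustration.

```python
from urllib.parse import urlencode, quote


def build_select_url(base_url, query, **params):
    """Build a Solr select URL with every parameter percent-encoded."""
    all_params = {"q": query, "wt": "json"}
    all_params.update(params)
    # quote_via=quote encodes spaces as %20 (the default would use '+').
    return base_url + "?" + urlencode(all_params, quote_via=quote)


# Hypothetical host and query, for illustration only.
url = build_select_url("http://localhost:8983/solr/select", "title:red shoes", rows=10)
print(url)
# http://localhost:8983/solr/select?q=title%3Ared%20shoes&wt=json&rows=10
```

Encoding every parameter, rather than only replacing spaces, also covers characters such as ':' and '&' that would otherwise break the request.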
 
  Dennis Gearon
 
 
 



Re: Load cores without restarting/reloading Solr

2010-07-21 Thread Andrew McCombe
Hi Peter

We are using the packaged Ubuntu Server  (10.04 LTS) versions of Tomcat6 and
Solr1.4 and running a single instance of Solr with multiple cores.

Regards
Andrew

On 20 July 2010 19:47, Peter Karich peat...@yahoo.de wrote:

 Hi Andrew,

 the whole tomcat shouldn't fail on restart if only one core fails.
 We are using the setup described here:
 http://wiki.apache.org/solr/SolrTomcat

 With the help of several different Tomcat Context xml files (under
 conf/Catalina/localhost/) the cores should be independent webapps:
 A different data directory (+config) and even a different solr version
 is possible.

 Or are you using the same setup?

 Regards,
 Peter.

  Hi
 
  Sorry, it wasn't very clear, was it?
 
  Yes, I use a 'template' core that isn't used and create a copy of this on
  the command line. I then edit the newcore/conf/solrconfig.xml and set the
  data path, add data-import sections etc and then I edit the
  solr.home/solr.xml and add the core name and directory to that.  I then go
  to the Tomcat manager/html and reload Solr.
 
  The problem I get is that if I have broken something in the new core Solr
  (correctly) doesn't reload and the other cores aren't then working.
 
  I don't need replication just yet but I will be looking into that
  eventually.
 
  Regards
  Andrew
 
 
  On 20 July 2010 10:32, Peter Karich peat...@yahoo.de wrote:
 
 
  Hi Andrew,
 
  I didn't correctly understand what you are trying to do with 'copying'?
  Just use one core as a template or use it to replicate data?
 
  You can reload only one application via:
  http://localhost/manager/html/reload?path=/yourapp
  (if you do this often you need to increase the PermGen space)
 
  You can replicate a core:
  http://wiki.apache.org/solr/SolrReplication
 
  Regards,
  Peter.
 
 
  Hi
 
  We have a few cores set up for separate sites and one of these is in use
  constantly.  When I add a new core I am currently copying one of the other
  cores and renaming it, changing the conf etc and then reloading Solr via
  the Tomcat manager.  However, if something goes wrong then the other cores
  stop working until I have resolved the problem.
 
  My questions are:
 
  1) Is using a separate core for different sites the correct method?
 
  2) Is there a way of creating a core and starting it without having to
  reload Solr or restart tomcat?
 
  3) I've looked at the Solr Cores CREATE handler but from what I gather, I
  need to create the core folder and edit the solr.xml first before loading
  the core with action=CREATE. Is that correct?
 
  Regards
  Andrew
 
 
 




Load cores without restarting/reloading Solr

2010-07-20 Thread Andrew McCombe
Hi

We have a few cores set up for separate sites and one of these is in use
constantly.  When I add a new core I am currently copying one of the other
cores and renaming it, changing the conf etc and then reloading Solr via the
Tomcat manager.  However, if something goes wrong then the other cores stop
working until I have resolved the problem.

My questions are:

1) Is using a separate core for different sites the correct method?

2) Is there a way of creating a core and starting it without having to
reload Solr or restart tomcat?

3) I've looked at the Solr Cores CREATE handler but from what I gather, I
need to create the core folder and edit the solr.xml first before loading
the core with action=CREATE. Is that correct?
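
For reference on question 3, here is a minimal sketch of what a CoreAdmin CREATE request looks like, written in Python only to show the URL shape. It assumes the core admin handler is exposed at /admin/cores (the adminPath setting in solr.xml) and that the instance directory and its conf/ files already exist on disk. The host, port and core name are made up.

```python
from urllib.parse import urlencode


def core_create_url(solr_base, name, instance_dir, data_dir=None):
    """Build a CoreAdmin CREATE URL for a core whose directory already exists."""
    params = {"action": "CREATE", "name": name, "instanceDir": instance_dir}
    if data_dir is not None:
        params["dataDir"] = data_dir
    return solr_base.rstrip("/") + "/admin/cores?" + urlencode(params)


# Hypothetical Tomcat host/port and core name.
print(core_create_url("http://localhost:8080/solr", "site2", "site2"))
# http://localhost:8080/solr/admin/cores?action=CREATE&name=site2&instanceDir=site2
```

Requesting that URL (for example with curl) asks the running Solr to load the new core without a webapp reload. Whether a broken config in the new core leaves the other cores untouched is worth testing on a non-production instance first.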

Regards
Andrew


Re: Load cores without restarting/reloading Solr

2010-07-20 Thread Andrew McCombe
Hi

Sorry, it wasn't very clear, was it?

Yes, I use a 'template' core that isn't used and create a copy of this on
the command line. I then edit the newcore/conf/solrconfig.xml and set the
data path, add data-import sections etc and then I edit the
solr.home/solr.xml and add the core name and directory to that.  I then go to
the Tomcat manager/html and reload Solr.

The problem I get is that if I have broken something in the new core Solr
(correctly) doesn't reload and the other cores aren't then working.

I don't need replication just yet but I will be looking into that
eventually.

Regards
Andrew


On 20 July 2010 10:32, Peter Karich peat...@yahoo.de wrote:

 Hi Andrew,

 I didn't correctly understand what you are trying to do with 'copying'?
 Just use one core as a template or use it to replicate data?

 You can reload only one application via:
 http://localhost/manager/html/reload?path=/yourapp
 (if you do this often you need to increase the PermGen space)

 You can replicate a core:
 http://wiki.apache.org/solr/SolrReplication

 Regards,
 Peter.

  Hi
 
  We have a few cores set up for separate sites and one of these is in use
  constantly.  When I add a new core I am currently copying one of the other
  cores and renaming it, changing the conf etc and then reloading Solr via
  the Tomcat manager.  However, if something goes wrong then the other cores
  stop working until I have resolved the problem.
 
  My questions are:
 
  1) Is using a separate core for different sites the correct method?
 
  2) Is there a way of creating a core and starting it without having to
  reload Solr or restart tomcat?
 
  3) I've looked at the Solr Cores CREATE handler but from what I gather, I
  need to create the core folder and edit the solr.xml first before loading
  the core with action=CREATE. Is that correct?
 
  Regards
  Andrew
 
 


 --
 http://karussell.wordpress.com/




Security/authentication strategies

2010-04-29 Thread Andrew McCombe
Hi

I'm planning on adding some protection to our solr servers and would
like to know what others are doing in this area.

Basically I have a few Solr cores running under tomcat6 and all use DIH
to populate the Solr index.  This is all behind a firewall and only
accessible from certain IP addresses.  Access to Solr Admin is open to
anyone in the company and many use it for checking data is in the
index and simple analysis.  However, they can also trigger a
full-import if they are careless (one of the cores takes 6 hours to
ingest the data).

What would be the recommended way of protecting things like the DIH
functionality? HTTP Authentication via tomcat realms or are there any
other solutions?

Thanks
Andrew McCombe
iWeb Solutions


Re: Security/authentication strategies

2010-04-29 Thread Andrew McCombe
Thanks for this Peter.  I have managed to get this working with Tomcat.

Andrew

On 29 April 2010 12:11, Peter Sturge peter.stu...@googlemail.com wrote:
 Hi Andrew,

 Today, authentication is handled by the container (e.g. Tomcat, Jetty etc.).


 There's a thread I found to be very useful on this topic here:

 http://www.lucidimagination.com/search/document/d1e338dc452db2e4/how_can_i_protect_the_solr_cores

 This was for Jetty, but the idea is pretty much the same for Tomcat.

 HTH

 Peter



 On Thu, Apr 29, 2010 at 8:42 AM, Andrew McCombe eupe...@gmail.com wrote:

 Hi

 I'm planning on adding some protection to our solr servers and would
 like to know what others are doing in this area.

 Basically I have a few Solr cores running under tomcat6 and all use DIH
 to populate the Solr index.  This is all behind a firewall and only
 accessible from certain IP addresses.  Access to Solr Admin is open to
 anyone in the company and many use it for checking data is in the
 index and simple analysis.  However, they can also trigger a
 full-import if they are careless (one of the cores takes 6 hours to
 ingest the data).

 What would be the recommended way of protecting things like the DIH
 functionality? HTTP Authentication via tomcat realms or are there any
 other solutions?

 Thanks
 Andrew McCombe
 iWeb Solutions




Re: Unable to load MailEntityProcessor or org.apache.solr.handler.dataimport.MailEntityProcessor

2010-04-06 Thread Andrew McCombe
Hi Lance

Thanks for this.  The wiki definitely isn't clear about this. I will
test this tonight.

Regards
Andrew

On 5 April 2010 23:04, Lance Norskog goks...@gmail.com wrote:
 The MailEntityProcessor is an extra and does not come normally with
 the DataImportHandler. The wiki page should mention this.

 In the Solr distribution it should be in the dist/ directory as
 dist/apache-solr-dataimporthandler-extras-1.4.jar. The class it wants
 is in this jar. (Do 'unzip -l jar' to find the classes inside a jar.)

 You have to make a lib/ directory in the Solr core you are using, and
 copy this jar into there.
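
A jar is a zip archive, so the 'unzip -l' check above can also be scripted. This is only a sketch in Python. The function name is made up, and the jar path in the usage comment is the one mentioned above, relative to wherever the Solr tarball was unpacked.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar holds a .class entry for the simple class name."""
    target = class_name + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        # Entries look like org/apache/.../MailEntityProcessor.class
        return any(
            entry == target or entry.endswith("/" + target)
            for entry in jar.namelist()
        )


# Example usage (path is relative to the unpacked Solr distribution):
# jar_contains_class("dist/apache-solr-dataimporthandler-extras-1.4.jar",
#                    "MailEntityProcessor")
```

If this returns True for the extras jar, the remaining step is the copy into the core's lib/ directory described above.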

 On Mon, Apr 5, 2010 at 1:15 PM, Andrew McCombe eupe...@gmail.com wrote:
 Hi

 Can no-one help me with this?

 Andrew

 On 2 April 2010 22:24, Andrew McCombe eupe...@gmail.com wrote:
 Hi

 I am experimenting with Solr to index my gmail and am experiencing an error:

 'Unable to load MailEntityProcessor or
 org.apache.solr.handler.dataimport.MailEntityProcessor'

 I downloaded a fresh 1.4 tgz, extracted it and added the following to
 example/solr/config/solrconfig.xml:


 <requestHandler name="/dataimport"
                 class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">/home/andrew/bin/apache-solr-1.5-dev/example/solr/conf/email-data-config.xml</str>
    </lst>
 </requestHandler>

 email-data-config.xml contained the following:

 <dataConfig>
 <document name="mailindex">
   <entity processor="MailEntityProcessor"
           user="eupe...@gmail.com"
           password="xx"
           host="imap.gmail.com"
           protocol="imaps"
           folders="inbox"/>
 </document>
 </dataConfig>

 Whenever I try to import data using /dataimport?command=full-import I
 am seeing the error below:

 Apr 2, 2010 10:14:51 PM
 org.apache.solr.handler.dataimport.DataImporter doFullImport
 SEVERE: Full Import failed
 org.apache.solr.handler.dataimport.DataImportHandlerException: Unable
 to load EntityProcessor implementation for entity:11418758786959
 Processing Document # 1
        at 
 org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.getEntityProcessor(DocBuilder.java:805)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:536)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:261)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:185)
        at 
 org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:333)
        at 
 org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:391)
        at 
 org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
 Caused by: java.lang.ClassNotFoundException: Unable to load
 MailEntityProcessor or
 org.apache.solr.handler.dataimport.MailEntityProcessor
        at 
 org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:966)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.getEntityProcessor(DocBuilder.java:802)
        ... 6 more
 Caused by: org.apache.solr.common.SolrException: Error loading class
 'MailEntityProcessor'
        at 
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:373)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:956)
        ... 7 more
 Caused by: java.lang.ClassNotFoundException: MailEntityProcessor
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:592)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at 
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:357)
        ... 8 more
 Apr 2, 2010 10:14:51 PM org.apache.solr.update.DirectUpdateHandler2 rollback
 INFO: start rollback
 Apr 2, 2010 10:14:51 PM org.apache.solr.update.DirectUpdateHandler2 rollback
 INFO: end_rollback


 Am I missing a step somewhere? I have tried this with the standard
 apache 1.4, a nightly of 1.5 and also the LucidWorks release and get
 the same issue with each.  The wiki isn't very detailed either. My
 background isn't in Java so a lot of this is new to me.


 Regards
 Andrew McCombe





 --
 Lance Norskog
 goks...@gmail.com



Re: Unable to load MailEntityProcessor or org.apache.solr.handler.dataimport.MailEntityProcessor

2010-04-05 Thread Andrew McCombe
Hi

Can no-one help me with this?

Andrew

On 2 April 2010 22:24, Andrew McCombe eupe...@gmail.com wrote:
 Hi

 I am experimenting with Solr to index my gmail and am experiencing an error:

 'Unable to load MailEntityProcessor or
 org.apache.solr.handler.dataimport.MailEntityProcessor'

 I downloaded a fresh 1.4 tgz, extracted it and added the following to
 example/solr/config/solrconfig.xml:


 <requestHandler name="/dataimport"
                 class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">/home/andrew/bin/apache-solr-1.5-dev/example/solr/conf/email-data-config.xml</str>
    </lst>
 </requestHandler>

 email-data-config.xml contained the following:

 <dataConfig>
 <document name="mailindex">
   <entity processor="MailEntityProcessor"
           user="eupe...@gmail.com"
           password="xx"
           host="imap.gmail.com"
           protocol="imaps"
           folders="inbox"/>
 </document>
 </dataConfig>

 Whenever I try to import data using /dataimport?command=full-import I
 am seeing the error below:

 Apr 2, 2010 10:14:51 PM
 org.apache.solr.handler.dataimport.DataImporter doFullImport
 SEVERE: Full Import failed
 org.apache.solr.handler.dataimport.DataImportHandlerException: Unable
 to load EntityProcessor implementation for entity:11418758786959
 Processing Document # 1
        at 
 org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.getEntityProcessor(DocBuilder.java:805)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:536)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:261)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:185)
        at 
 org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:333)
        at 
 org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:391)
        at 
 org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
 Caused by: java.lang.ClassNotFoundException: Unable to load
 MailEntityProcessor or
 org.apache.solr.handler.dataimport.MailEntityProcessor
        at 
 org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:966)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.getEntityProcessor(DocBuilder.java:802)
        ... 6 more
 Caused by: org.apache.solr.common.SolrException: Error loading class
 'MailEntityProcessor'
        at 
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:373)
        at 
 org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:956)
        ... 7 more
 Caused by: java.lang.ClassNotFoundException: MailEntityProcessor
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:592)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at 
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:357)
        ... 8 more
 Apr 2, 2010 10:14:51 PM org.apache.solr.update.DirectUpdateHandler2 rollback
 INFO: start rollback
 Apr 2, 2010 10:14:51 PM org.apache.solr.update.DirectUpdateHandler2 rollback
 INFO: end_rollback


 Am I missing a step somewhere? I have tried this with the standard
 apache 1.4, a nightly of 1.5 and also the LucidWorks release and get
 the same issue with each.  The wiki isn't very detailed either. My
 background isn't in Java so a lot of this is new to me.


 Regards
 Andrew McCombe



Unable to load MailEntityProcessor or org.apache.solr.handler.dataimport.MailEntityProcessor

2010-04-02 Thread Andrew McCombe
Hi

I am experimenting with Solr to index my gmail and am experiencing an error:

'Unable to load MailEntityProcessor or
org.apache.solr.handler.dataimport.MailEntityProcessor'

I downloaded a fresh 1.4 tgz, extracted it and added the following to
example/solr/config/solrconfig.xml:


<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">/home/andrew/bin/apache-solr-1.5-dev/example/solr/conf/email-data-config.xml</str>
  </lst>
</requestHandler>

email-data-config.xml contained the following:

<dataConfig>
<document name="mailindex">
   <entity processor="MailEntityProcessor"
           user="eupe...@gmail.com"
           password="xx"
           host="imap.gmail.com"
           protocol="imaps"
           folders="inbox"/>
</document>
</dataConfig>

Whenever I try to import data using /dataimport?command=full-import I
am seeing the error below:

Apr 2, 2010 10:14:51 PM org.apache.solr.handler.dataimport.DataImporter doFullImport
SEVERE: Full Import failed
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to load EntityProcessor implementation for entity:11418758786959 Processing Document # 1
        at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
        at org.apache.solr.handler.dataimport.DocBuilder.getEntityProcessor(DocBuilder.java:805)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:536)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:261)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:185)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:333)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:391)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
Caused by: java.lang.ClassNotFoundException: Unable to load MailEntityProcessor or org.apache.solr.handler.dataimport.MailEntityProcessor
        at org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:966)
        at org.apache.solr.handler.dataimport.DocBuilder.getEntityProcessor(DocBuilder.java:802)
        ... 6 more
Caused by: org.apache.solr.common.SolrException: Error loading class 'MailEntityProcessor'
        at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:373)
        at org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:956)
        ... 7 more
Caused by: java.lang.ClassNotFoundException: MailEntityProcessor
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:592)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:357)
        ... 8 more
Apr 2, 2010 10:14:51 PM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: start rollback
Apr 2, 2010 10:14:51 PM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: end_rollback


Am I missing a step somewhere? I have tried this with the standard
apache 1.4, a nightly of 1.5 and also the LucidWorks release and get
the same issue with each.  The wiki isn't very detailed either. My
background isn't in Java so a lot of this is new to me.


Regards
Andrew McCombe


Re: error while using the DIH handler

2010-02-23 Thread Andrew McCombe
Hi

You'll find this in the exception:

Caused by: java.lang.RuntimeException: Can't find resource
'/solr/data-config.xml' in classpath or './example-DIH/solr/db/conf/',
cwd=/home/zaloni/Desktop/apache-solr-1.4.0/example

Have you checked that the data-config.xml is in the right place?
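
One quick way to check is to resolve the configured name the way the error message suggests Solr does. The sketch below is Python and rests on an assumption: a name with a leading slash (such as '/solr/data-config.xml') is taken as an absolute path, while a bare name is looked up in the core's conf directory. The function name and the paths in the usage comment are illustrative only.

```python
import os


def find_dih_config(conf_dir, config_name):
    """Return the path where the DIH config would be read from, or None if missing."""
    # Assumption: absolute names are used as given, others are relative to conf_dir.
    if os.path.isabs(config_name):
        candidate = config_name
    else:
        candidate = os.path.join(conf_dir, config_name)
    return candidate if os.path.isfile(candidate) else None


# Example usage, run from the 'example' directory of the Solr download:
# find_dih_config("./example-DIH/solr/db/conf", "/solr/data-config.xml")
# find_dih_config("./example-DIH/solr/db/conf", "db-data-config.xml")
```

If the first call returns None and a relative name finds the file, the config value in solrconfig.xml is the likely thing to change.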

Regards
Andrew

On 23 February 2010 12:47, Na_D nabam...@zaloni.com wrote:



 I tried using the DIH in Solr using the steps as mentioned in:

 http://wiki.apache.org/solr/DataImportHandler#datasource

 On running the command: java -Dsolr.solr.home=./example-DIH/solr/ -jar
 start.jar

 i.e. when I use the DIH example, it gives an error %-| (below)

 ...
 HTTP ERROR: 500

 Severe errors in solr configuration.

 Check your log files for more detailed information on what may be wrong.

 If you want solr to continue after configuration errors, change:

  <abortOnConfigurationError>false</abortOnConfigurationError>

 in null

 -
 org.apache.solr.common.SolrException: FATAL: Could not create importer. DataImporter config invalid
        at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:121)
        at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:486)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:588)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
        at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
        at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
        at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218)
        at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
        at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
        at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
        at org.mortbay.jetty.Server.doStart(Server.java:210)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.mortbay.start.Main.invokeMain(Main.java:183)
        at org.mortbay.start.Main.start(Main.java:497)
        at org.mortbay.start.Main.main(Main.java:115)
 Caused by: java.lang.RuntimeException: Can't find resource '/solr/data-config.xml' in classpath or './example-DIH/solr/db/conf/', cwd=/home/zaloni/Desktop/apache-solr-1.4.0/example
        at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:260)
        at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:113)
        ... 28 more
 -
 java.lang.RuntimeException: Can't find resource '/solr/data-config.xml' in classpath or './example-DIH/solr/db/conf/', cwd=/home/zaloni/Desktop/apache-solr-1.4.0/example
        at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:260)
        at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:113)
        at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:486)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:588)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
        at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
at 

Re: SOLR Performance Tuning: Disable INFO Logging.

2009-12-21 Thread Andrew McCombe
Hi

Can you quickly explain what you did to disable INFO-Level?

I am from a PHP background and am not so well versed in Tomcat or
Java.  Is this a section in solrconfig.xml or did you have to edit
Solr Java source and recompile?

Thanks In Advance
Andrew


2009/12/20 Fuad Efendi f...@efendi.ca:
 After researching how to configure default SOLR & Tomcat logging, I finally
 disabled INFO-level for SOLR.

 And performance improved at least 7 times!!! ('at least 7' because I
 restarted server 5 minutes ago; caches are not prepopulated yet)

 Before that, I had 300-600 ms in HTTPD log files in average, and 4%-8% I/O
 wait whenever top commands shows SOLR on top.

 Now, I have 50ms-100ms in average (total response time logged by HTTPD).


 P.S.
 Of course, I am limited in RAM, and I use slow SATA... server is moderately
 loaded, 5-10 requests per second.


 P.P.S.
 And suddenly synchronous I/O by Java/Tomcat Logger slows down performance
 much higher than read-only I/O of Lucene.



 Fuad Efendi
 +1 416-993-2060
 http://www.linkedin.com/in/liferay

 Tokenizer Inc.
 http://www.tokenizer.ca/
 Data Mining, Vertical Search







Re: One more happy Solr user ...

2009-10-14 Thread Andrew McCombe
Hi
Nice site.  First search I tried was for 'italien' in 'Mumbai' which
returned zero results.   Are you using spellcheck suggestions?

Apart from that it's nice and fast.

Regards
Andrew McCombe
iWebsolutions.co.uk


2009/10/14 Avlesh Singh avl...@gmail.com

 I am pleased to announce the latest release of a popular Indian local
 search portal called http://www.burrp.com http://mumbai.burrp.com.
 In prior versions of this web application, search was Lucene driven and we
 had to write our own implementation of search facets amongst other painful
 tasks.

 I can't be happier to inform everyone on this list that search/suggest
 features on the burrp! site are now powered by Solr.
 Please use it and let me know if we can make it better.

 Very soon, I'll be back to report another usage of Solr (a grand one by
 scale).
 Thank you Solr developers.

 Cheers
 Avlesh



A little help with indexing joined words

2009-10-05 Thread Andrew McCombe
Hi
I am hoping someone can point me in the right direction with regards to
indexing words that are concatenated together to make other words or product
names.

We have indexed a product database and have come across some search terms
where zero results are returned.  There are products in the index with
'Borderlands xxx xxx', 'Dragonfly xx xxx' in the title.  Searches for
'Borderland' or 'Border Land', and for 'Dragon Fly', return zero results.

Where do I look to resolve this?  The product name field is indexed using a
text field type.

Thanks in advance
Andrew


Best approach to multiple languages

2009-07-22 Thread Andrew McCombe
Hi

We have a dataset that contains productname, category and descriptions.  The
descriptions can be in one or more different languages.  What would be the
recommended way of indexing these?

My initial thoughts are to index each description as a separate field and
append the language identifier to the field name, for example, three fields
with description_en, description_de, description_fr.  Is this the best
approach or is there a better way?
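
To make the field-per-language idea concrete, here is a small sketch in Python of how one document could be assembled before posting it to Solr. The function names are made up. The productname, category and description_<lang> field names follow the description above, and each description_* field would need a matching (or dynamic) field definition in schema.xml.

```python
def build_product_doc(product_name, category, descriptions):
    """Build one document dict; descriptions maps a language code to its text."""
    doc = {"productname": product_name, "category": category}
    for lang, text in descriptions.items():
        # One field per language, e.g. description_en, description_de.
        doc["description_" + lang] = text
    return doc


def description_field(lang):
    """Name of the field to search once the user's language is known."""
    return "description_" + lang


doc = build_product_doc(
    "Kettle", "Kitchen", {"en": "Electric kettle", "de": "Wasserkocher"}
)
print(sorted(doc))
# ['category', 'description_de', 'description_en', 'productname']
print(description_field("de"))
# description_de
```

With one field per language, each field can be given its own analyzer (stemming, stop words) for that language.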

Regards
Andrew McCombe


Re: Best approach to multiple languages

2009-07-22 Thread Andrew McCombe
Hi

We will  know the user's language choice before searching.

Regards
Andrew

2009/7/22 Grant Ingersoll gsing...@apache.org

 How do you want to search those descriptions?  Do you know the query
 language going in?


 On Jul 22, 2009, at 6:12 AM, Andrew McCombe wrote:

  Hi

 We have a dataset that contains productname, category and descriptions.
  The
 descriptions can be in one or more different languages.  What would be the
 recommended way of indexing these?

 My initial thoughts are to index each description as a separate field and
 append the language identifier to the field name, for example, three
 fields
 with description_en, description_de, description_fr.  Is this the best
 approach or is there a better way?

 Regards
 Andrew McCombe


 --
 Grant Ingersoll
 http://www.lucidimagination.com/

 Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
 Solr/Lucene:
 http://www.lucidimagination.com/search




Re: FATAL: Solr returned an error: Invalid_Date_String

2009-07-21 Thread Andrew McCombe
Hi

Dates must be in ISO 8601 format:

http://lucene.apache.org/solr/api/org/apache/solr/schema/DateField.html

e.g. 1995-12-31T23:59:59Z
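
If the source data only has a day such as 2009-05-06, a small conversion
step before posting might look like the sketch below. The function name
to_solr_date is made up, and treating a bare date as midnight UTC is an
assumption that may not suit every dataset.

```python
from datetime import datetime, timezone


def to_solr_date(value):
    """Convert a 'YYYY-MM-DD' string to the UTC form shown above.

    Assumption: a date without a time is treated as midnight UTC.
    """
    day = datetime.strptime(value, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return day.strftime("%Y-%m-%dT%H:%M:%SZ")


solr_date = to_solr_date("2009-05-06")
```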

Hope this helps

Andrew McCombe


2009/7/21 Mick England mic...@mac.com


 Hi,

 I have the following tag in my xml files:

 <field name="timestamp">2009-05-06</field>

 When I try posting the file I get this error:

 FATAL: Solr returned an error: Invalid_Date_String20090506

 My schema.xml file has this:

   <field name="timestamp" type="date" indexed="true" stored="true"
 default="NOW" multiValued="false"/>

 How do I specify a correct date string?

 --
 View this message in context:
 http://www.nabble.com/FATAL%3A-Solr-returned-an-error%3A-Invalid_Date_String-tp24594686p24594686.html
 Sent from the Solr - User mailing list archive at Nabble.com.




1.4 stable release date

2009-07-02 Thread Andrew McCombe
Hi

Just wondering if there is a release date for 1.4 stable?

Regards
Andrew


Integer field showing as boolean, breaking phps writer

2009-05-15 Thread Andrew McCombe
Hello

I have a field defined in schema.xml as an integer which should
contain one of the values 0, 1, 2, 10 or 11, but my result documents are
showing this as either 'true' or 'false'. The majority of the half
million documents have this field as 0 or 1 but around 6,000 have it
as 2, 10 or 11.  The field is stored but not indexed.

Also, I am using the PHPS responsewriter and I am finding that
unserializing the results in php fails.  I suspect that this is
because the output is defining the field as an integer but then
showing false:

 e.g.  ...;s:11:"recommended";i:false;

Does anyone know why my field is stored as boolean when it should be an integer?

Regards
Andrew
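
For reference, PHP's serialize() format writes an integer as i:<digits>; and
a boolean as b:0; or b:1;, so a token like i:false; fits neither form, which
would explain unserialize() failing. A rough way to spot such tokens in the
output is sketched below; the regular expressions and the is_valid_scalar
helper are illustrative only and cover just these two scalar types.

```python
import re

# Integer tokens look like i:42; and boolean tokens like b:0; or b:1;
INTEGER_TOKEN = re.compile(r"i:-?\d+;")
BOOLEAN_TOKEN = re.compile(r"b:[01];")


def is_valid_scalar(token):
    """Return True if token is a well-formed serialized integer or boolean."""
    return bool(INTEGER_TOKEN.fullmatch(token) or BOOLEAN_TOKEN.fullmatch(token))


# The last token is the malformed form quoted in the message above.
checks = {token: is_valid_scalar(token) for token in ("i:10;", "b:0;", "i:false;")}
```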


Re: Delete documents from index with dataimport

2009-05-14 Thread Andrew McCombe
Hi

Yes I'd like the document deleted from Solr and yes, there is a unique
document id field in Solr.

Regards
Andrew


2009/5/13 Fergus McMenemie fer...@twig.me.uk:
Hi

Is it possible, through the dataimport handler, to remove an existing
document from the Solr index?

I import/update from my database where the active field is true.
However, if the client then sets active to false, the document stays
in the Solr index and doesn't get removed.

Regards
Andrew

 Yes but only in the latest trunk. If your active field is false
 do you want to see the document deleted? Do you have another field
 which is a unique ID for the document?

 Fergus
 --

 ===
 Fergus McMenemie               Email:fer...@twig.me.uk
 Techmore Ltd                   Phone:(UK) 07721 376021

 Unix/Mac/Intranets             Analyst Programmer
 ===



Re: Who is running 1.4 nightly in production?

2009-05-13 Thread Andrew McCombe
We are using a nightly from 13/04.  I've found one issue with the PHP
ResponseWriter but apart from that it has been pretty solid.

I'm using the bundled Jetty server to run it for the moment but hope
to move to Tomcat once released and stable (and I have learned
Tomcat!).

Andrew


2009/5/12 Walter Underwood wunderw...@netflix.com:
 We're planning our move to 1.4, and want to run one of our production
 servers with the new code. Just to feel better about it, is anyone else
 running 1.4 in production?

 I'm building 2009-05-11 right now.

 wunder




Delete documents from index with dataimport

2009-05-13 Thread Andrew McCombe
Hi

Is it possible, through the dataimport handler, to remove an existing
document from the Solr index?

I import/update from my database where the active field is true.
However, if the client then sets active to false, the document stays
in the Solr index and doesn't get removed.

Regards
Andrew
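
One possible workaround, independent of the dataimport handler, is to send
Solr's XML delete message to the update handler for each row that has gone
inactive, followed by a commit. The sketch below only builds the message
bodies; the ids are invented, delete_by_id_xml is a made-up helper, and it
assumes the document's unique key is the database id.

```python
from xml.sax.saxutils import escape


def delete_by_id_xml(doc_id):
    """Build a Solr XML delete message for one unique document id."""
    return "<delete><id>%s</id></delete>" % escape(str(doc_id))


# Hypothetical ids of rows whose active flag has been set to false.
inactive_ids = [101, 205]
payloads = [delete_by_id_xml(doc_id) for doc_id in inactive_ids]

# Each payload would be POSTed to the update handler; a <commit/> message
# is then needed before the deletions become visible to searches.
commit_payload = "<commit/>"
```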


Stop dataimport full-import

2009-05-11 Thread Andrew McCombe
Hi

Is it possible to stop a full-import from a dataimport handler and if so, how?

If I stop the import or stop Jetty and restart it whilst the
full-import is taking place, will it delete the indexed data?

Thanks in Advance
Andrew


Re: Stop dataimport full-import

2009-05-11 Thread Andrew McCombe
Hi

Thanks.  Found out the hard way that abort also removes the index :)

Regards
Andrew
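
For anyone finding this thread later, the abort request is just a URL with
command=abort on the dataimport handler. A minimal sketch of building it
follows; the base URL and the /dataimport handler path are assumptions about
a default local setup, and dataimport_url is a made-up helper.

```python
from urllib.parse import urlencode


def dataimport_url(base_url, command):
    """Build a dataimport handler URL for the given command."""
    query = urlencode({"command": command})
    return "%s/dataimport?%s" % (base_url.rstrip("/"), query)


# Assumed local Solr location; adjust host, port and path as needed.
abort_url = dataimport_url("http://localhost:8983/solr", "abort")
```

Given the experience above, it is probably worth taking a copy of the index
before experimenting with abort on live data.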



2009/5/11 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 you can abort a running import with command=abort

 if you kill the jetty in between Lucene would commit the uncommitted docs

 On Mon, May 11, 2009 at 3:13 PM, Andrew McCombe eupe...@gmail.com wrote:
 Hi

 Is it possible to stop a full-import from a dataimport handler and if so, 
 how?

 If I stop the import or stop Jetty and restart it whilst the
 full-import is taking place, will it delete the indexed data?

 Thanks in Advance
 Andrew




 --
 -
 Noble Paul | Principal Engineer| AOL | http://aol.com



Spellcheck.build

2009-05-05 Thread Andrew McCombe
Hi

I have imported/indexed around half a million rows from my database
into solr and then rebuilt the spellchecker.  I've also setup the
delta-import to handle any new or changed rows from the database.  Do
I need to rebuild the spellchecker each time I run the delta-import?

Regards
Andrew
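
As far as I understand it, the spellcheck index is not refreshed on its own
unless the component is configured to build on commit, so one option is to
issue a request with spellcheck.build=true after each delta-import. A sketch
of building such a request URL is below; the base URL and the "select"
handler name are assumptions, as is the spellcheck component being attached
to that handler.

```python
from urllib.parse import urlencode


def spellcheck_build_url(base_url, handler="select"):
    """Build a URL asking the spellcheck component to rebuild its index."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),  # no documents needed, only the rebuild side effect
        ("spellcheck", "true"),
        ("spellcheck.build", "true"),
    ]
    return "%s/%s?%s" % (base_url.rstrip("/"), handler, urlencode(params))


# Assumed local Solr location.
build_url = spellcheck_build_url("http://localhost:8983/solr")
```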


Filter query with wildcard, fq=a*

2009-04-30 Thread Andrew McCombe
Hi

I have half a million records indexed and need to filter results on a term
by the first letter.  For example, the search term is 'I love' and returns a
few thousand records.  I need to filter these results for all artists
beginning with 'A'.

I've tried:
'fq=artistText:A*'

But then get no results.

Can someone point me in the right direction please?

Regards
Andrew
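
One likely cause is that wildcard terms are not run through the field's
analyzer, so if artistText is lowercased at index time, 'A*' matches nothing
while 'a*' does. The sketch below builds the request parameters with a
lowercased prefix; prefix_filter is a made-up helper and the lowercasing
assumption depends on how the field type is defined in schema.xml.

```python
from urllib.parse import urlencode


def prefix_filter(field, prefix):
    """Build an fq clause for a prefix match.

    The prefix is lowercased here because wildcard terms bypass analysis
    (assumption: the field is lowercased at index time).
    """
    return "%s:%s*" % (field, prefix.lower())


params = urlencode([("q", "I love"), ("fq", prefix_filter("artistText", "A"))])
```

Note that on a tokenized field this matches any word in the artist name
starting with 'a'; matching only the first letter of the whole name would
need an untokenized copy of the field.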


What is QTime a measure of?

2009-04-06 Thread Andrew McCombe
Hi

Just started using Solr/Lucene and am getting to grips with it.  Great
product!

What is the QTime a measure of?  Is it milliseconds or seconds?  I tried a
Google search but couldn't find anything definitive.

Thanks In Advance

Andrew McCombe
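
For the record, QTime in the response header is the time Solr spent
processing the request, reported in milliseconds. A small sketch of reading
it from a JSON (wt=json) response follows; the sample response text is
invented and qtime_seconds is a made-up helper.

```python
import json


def qtime_seconds(response_text):
    """Return the QTime from a Solr JSON response, converted to seconds."""
    header = json.loads(response_text)["responseHeader"]
    return header["QTime"] / 1000.0  # QTime is reported in milliseconds


# Invented sample response for illustration.
sample = (
    '{"responseHeader": {"status": 0, "QTime": 15},'
    ' "response": {"numFound": 0, "start": 0, "docs": []}}'
)
elapsed = qtime_seconds(sample)
```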