Re: search filter
Looks like I am getting the exception below:

May 22, 2013 10:52:11 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.NumberFormatException: For input string: "[3 TO 9] OR salary:0"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Long.parseLong(Long.java:438)
    at java.lang.Long.parseLong(Long.java:478)

Regards
kamal

On Thu, May 23, 2013 at 11:19 AM, Kamal Palei palei.ka...@gmail.com wrote:

Hi Rafał Kuć,
I tried fq=Salary:[5+TO+10]+OR+Salary:0 as well as fq=Salary:[5 TO 10] OR Salary:0; in both cases I retrieved 0 results. I use Drupal along with Solr; my code looks as below:

    if ($include_0_salary == 1) {
        $conditions['fq'][0] = 'salary:[' . $min_ctc . '+TO+' . $max_ctc . ']+OR+salary:0';
    } else {
        $conditions['fq'][0] = 'salary:[' . $min_ctc . ' TO ' . $max_ctc . ']';
    }
    $conditions['fq'][1] = 'experience:[' . $min_exp . ' TO ' . $max_exp . ']';
    $results = apachesolr_search_search_execute($keys, $conditions);

It looks like when $include_0_salary is false, I get results as expected. If $include_0_salary is true, I get 0 results, which means that for me

    $conditions['fq'][0] = 'salary:[5 TO 10] OR salary:0'

did not work. Can somebody help me with what I am doing wrong here?

Best regards
kamal

On Wed, May 22, 2013 at 7:00 PM, Rafał Kuć r@solr.pl wrote:

Hello!
You can try sending a filter like this: fq=Salary:[5+TO+10]+OR+Salary:0. It should work.

--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

Dear All,
Can I write a search filter for a field having a value in a range or a specific value? Say I want a filter like: select profiles with salary 5 to 10, or salary 0. So I expect profiles having salary 0, 5, 6, 7, 8, 9, or 10. It should be possible; can somebody help me with the syntax of the 'fq' filter, please?

Best Regards
kamal
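Editor's note: one likely culprit (an assumption, not confirmed in the thread) is the literal '+' characters embedded in the fq value. '+' stands for a space only in URL encoding, so it belongs to the transport layer; the parameter value handed to the client library should contain literal spaces. A minimal sketch of building the value that way (FqBuilder and salaryFq are hypothetical names, not part of Drupal's apachesolr module):

```java
// Sketch: build the fq value with literal spaces; the HTTP layer will
// URL-encode it later. Parentheses around the OR avoid default-operator
// surprises when the clause is combined with others.
public class FqBuilder {
    public static String salaryFq(long min, long max, boolean includeZero) {
        String range = "salary:[" + min + " TO " + max + "]";
        // '+' must NOT appear here; encoding happens when the request is sent.
        return includeZero ? "(" + range + " OR salary:0)" : range;
    }

    public static void main(String[] args) {
        System.out.println(salaryFq(5, 10, true)); // (salary:[5 TO 10] OR salary:0)
    }
}
```

The NumberFormatException also suggests the whole string reached a numeric field parser unparsed, which is consistent with the range syntax being mangled before Solr saw it.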
OPENNLP current patch compiling problem for 4.x branch
Hi, I checked out from http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_3_0 and downloaded the latest patch LUCENE-2899-current.patch. The patch applied OK, but when I ran 'ant compile' I got the following error:

==
    [javac] /home/lucene_solr_4_3_0/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/FilterPayloadsFilter.java:43: error: cannot find symbol
    [javac]     super(Version.LUCENE_44, input);
    [javac]                  ^
    [javac]   symbol:   variable LUCENE_44
    [javac]   location: class Version
    [javac] 1 error
==

It compiled on trunk without problems. Is this patch supposed to work for 4.x?

Regards,
Patrick
Re: Solr french search optimisation
Hello again, could anyone help me, please?

David

Le 22/05/2013 18:09, It-forum a écrit :

Hello to all,

I'm trying to set up Solr 4.2 to index and search French content. I defined a special fieldType for French content:

<fieldType name="text_fr" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="French" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="French" protected="protwords.txt"/>
  </analyzer>
</fieldType>

Unfortunately, this field does not behave as I wish. I'd like to be able to get results from misspelled words. I.e. I wish to get the same result typing "Pompe à chaleur" as typing "pomppe a chaler", or with "solère" and "solaire". I cannot find the right way to create a fieldType to reach this aim.

Thanks in advance for your help; do not hesitate to ask for more information if needed.

Regards
David
Re: Solr french search optimisation
Hello, I think you're confusing three different things:

1) schema and field definitions are about precision/recall: treating a field differently means different search results and result ranking
2) the "pomppe a chaler" problem is more a spellchecking problem: http://wiki.apache.org/solr/SpellCheckComponent
3) "solère" vs "solaire" is a phonetic search problem: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PhoneticFilterFactory

Hope this helps a little,
cristian

2013/5/23 It-forum it-fo...@meseo.fr

Hello again, could anyone help me, please?

David

Le 22/05/2013 18:09, It-forum a écrit :

Hello to all, I'm trying to set up Solr 4.2 to index and search French content. I defined a special fieldType for French content: [fieldType definition snipped; see the original message]. Unfortunately, this field does not behave as I wish. I'd like to be able to get results from misspelled words. I.e. I wish to get the same result typing "Pompe à chaleur" as typing "pomppe a chaler", or with "solère" and "solaire". I cannot find the right way to create a fieldType to reach this aim. Thanks in advance for your help; do not hesitate to ask for more information if needed. Regards David
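Editor's note: for the phonetic case (point 3), a companion field analyzed with a phonetic encoder is the usual approach. A sketch, with the caveat that the field/type names below are made up and that DoubleMetaphone is tuned for English, so its behavior on French terms needs testing:

```xml
<!-- Hypothetical companion field type for phonetic matching. -->
<fieldType name="text_fr_phonetic" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="false"/>
  </analyzer>
</fieldType>

<field name="content_phonetic" type="text_fr_phonetic" indexed="true" stored="false"/>
<copyField source="content" dest="content_phonetic"/>
```

Queries would then search content_phonetic alongside the stemmed field (e.g. via qf in dismax/edismax), typically with a lower boost so exact matches rank first.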
Solr 4.3: node is seen as active in Zk while in recovery mode + endless recovery
Consider the following: Solr 4.3, a 2-node test cluster, each node a leader. During indexing (or immediately after it, before the hard commit) I shut down one of them and restart it later. The tlog is about 200MB in size. I see recurring 'Reordered DBQs detected' in the log; it seems like an endless loop because THE VERY SAME update query appears thousands of times, and it has been running for a long time now. In the meanwhile, the node is inaccessible (obviously), but in the Zk state it appears as active, NOT in recovery mode or down. It seems that this is caused by a recent change in ZkController which adds recovery logic into the 'register' routine. Regards, Alexey -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-3-node-is-seen-as-active-in-Zk-while-in-recovery-mode-endless-recovery-tp4065549.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to query docs with an indexed polygon field in java?
Hi Kevenz,

kevenz wrote:
... String sql = "indexType:219 AND geo:\"Contains(POINT(114.078327401257,22.5424866754136))\""; ... Then I got an error: java.lang.IllegalArgumentException: missing parens: Contains. Is there any suggestion?

First of all, if your query shape is a point, then use Intersects, which is semantically equivalent but works much faster. One error in your query is that your quotes look messed up. Another is that you used a comma to separate the X and Y when you should use a space (because you are using WKT syntax via POINT). Try this:

indexType:219 AND geo:"Contains(POINT(114.078327401257 22.5424866754136))"

This will also work using "lat comma lon" non-WKT syntax:

indexType:219 AND geo:"Contains(22.5424866754136, 114.078327401257)"

Disclaimer: I didn't run these, I just typed them into the email.

~ David

-
Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-query-docs-with-an-indexed-polygon-field-in-java-tp4065512p4065550.html
Re: Solr 4.3: node is seen as active in Zk while in recovery mode + endless recovery
A small correction: it's not an endless loop, but painfully slow processing, which includes running a delete query and then an insertion. Each document from the tlog takes tens of seconds to process (more than 100 times slower than the normal insertion process). -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-3-node-is-seen-as-active-in-Zk-while-in-recovery-mode-endless-recovery-tp4065549p4065551.html
Re: Solr french search optimisation
You can also think about using a SynonymFilter if you can list the misspelled words. That's a quick and dirty solution, but it's easier to add a "pomppe => pompe" line to a synonyms list than to tune a phonetic filter. NB: reindexing is required whenever the synonyms file changes.

Franck Brisbart

Le jeudi 23 mai 2013 à 08:59 +0200, Cristian Cascetta a écrit : Hello, I think you're confusing three different things: 1) schema and fields definition is for precision/recall: treating differently a field means different search results and results ranking 2) the pomppe a chaler problem is more a spellchecking problem http://wiki.apache.org/solr/SpellCheckComponent 3) solère and solaire is a phonetic search problem http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PhoneticFilterFactory Hope this helps a little, cristian 2013/5/23 It-forum it-fo...@meseo.fr Hello again, Is any one could help me, plase David Le 22/05/2013 18:09, It-forum a écrit : Hello to all, I'm trying to setup solr 4.2 to index and search into french content.
I defined a special fieldtype for french content : fieldType name=text_fr class=solr.TextField positionIncrementGap=100 analyzer type=index charFilter class=solr.**MappingCharFilterFactory mapping=mapping-**ISOLatin1Accent.txt/ tokenizer class=solr.** WhitespaceTokenizerFactory/ filter class=solr.**WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=0 splitOnCaseChange=1/ filter class=solr.**LowerCaseFilterFactory/ filter class=solr.**SnowballPorterFilterFactory language=French protected=protwords.txt/ /analyzer analyzer type=query charFilter class=solr.**MappingCharFilterFactory mapping=mapping-**ISOLatin1Accent.txt/ tokenizer class=solr.** WhitespaceTokenizerFactory/ filter class=solr.**WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=0 catenateNumbers=0 catenateAll=0 splitOnCaseChange=1/ filter class=solr.**LowerCaseFilterFactory/ filter class=solr.**SnowballPorterFilterFactory language=French protected=protwords.txt/ /analyzer /fieldType unfortunately, this field does not behave as I wish. I'd like to be able to get results from unwell spelled word. IE : I wish to get the same result typing Pompe à chaleur than typing pomppe a chaler or with solère and solaire I'm do not find the right way to create a fieldtype to reach this aim. thanks in advance for your help, do not hesitate for more information if need. Regards David
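Editor's note: Franck's synonym suggestion would look roughly like this in the index-time analyzer (the filename "misspellings.txt" is made up for this sketch):

```xml
<!-- Index-time expansion of known misspellings onto their correct form. -->
<filter class="solr.SynonymFilterFactory" synonyms="misspellings.txt"
        ignoreCase="true" expand="true"/>
```

with misspellings.txt containing one mapping per line, e.g. "pomppe => pompe" and "chaler => chaleur". As noted above, the collection must be reindexed whenever the file changes, since the mapping is applied at index time.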
Re: [ANNOUNCE] Web Crawler
Hi,

Release 3.0.3 was tested with:
* Oracle Java 6 (but should work fine with version 7)
* Tomcat 5.5, 6 and 7
* PHP 5.2.x and 5.3.x
* Apache 2.2.x
* MongoDB 64 bits 2.2 (known issue with 2.4)

The new release 4.0.0-alpha-2 is available on Github - https://github.com/bejean/crawl-anywhere

The prerequisites are:
* Oracle Java 6
* Tomcat 5.5
* Apache 2.2
* PHP 5.2.x, 5.3.x or 5.4.x
* MongoDB 64 bits 2.2
* Solr 3.x or 4.x (configuration files provided for Solr 4.3.0)

And the up-to-date installation instructions are here: http://www.crawl-anywhere.com/installation-v400/ Please read the Github project home page; all the information is provided there.

Regards,
Dominique

Le 23/05/13 07:38, Rajesh Nikam a écrit : Hi, Crawl Anywhere seems to be using old versions of Java, Tomcat, etc. http://www.crawl-anywhere.com/installation-v300/ Will it work with new versions of the required software? Is there an updated installation guide available? Thanks Rajesh

On Wed, May 22, 2013 at 6:48 PM, Dominique Bejean dominique.bej...@eolya.fr wrote: Hi, Crawl-Anywhere is now open source - https://github.com/bejean/crawl-anywhere Best regards.

Le 02/03/11 10:02, findbestopensource a écrit : Hello Dominique Bejean, Good job. We identified almost 8 open source web crawlers http://www.findbestopensource.com/tagged/webcrawler I don't know how far yours would be different from the rest. Your license states that it is not open source, but it is free for personal use. Regards Aditya www.findbestopensource.com

On Wed, Mar 2, 2011 at 5:55 AM, Dominique Bejean dominique.bej...@eolya.fr wrote: Hi, I would like to announce Crawl Anywhere. Crawl-Anywhere is a Java Web Crawler.
It includes:
* a crawler
* a document processing pipeline
* a Solr indexer

The crawler has a web administration interface in order to manage the web sites to be crawled. Each web site crawl is configured with a lot of possible parameters (not all mandatory):
* number of simultaneous items crawled by site
* recrawl period rules based on item type (html, PDF, …)
* item type inclusion / exclusion rules
* item path inclusion / exclusion / strategy rules
* max depth
* web site authentication
* language
* country
* tags
* collections
* ...

The pipeline includes various ready-to-use stages (text extraction, language detection, a Solr ready-to-index XML writer, ...). Everything is very configurable and extensible, either by scripting or by Java coding. With scripting technology, you can help the crawler handle javascript links, or help the pipeline extract a relevant title and clean up the html pages (remove menus, headers, footers, ...). With Java coding, you can develop your own pipeline stage.

The Crawl Anywhere web site provides good explanations and screen shots. Everything is documented in a wiki. The current version is 1.1.4. You can download and try it out from here: www.crawl-anywhere.com

Regards
Dominique

--
Dominique Béjean
+33 6 08 46 12 43
skype: dbejean
www.eolya.fr
www.crawl-anywhere.com
www.mysolrserver.com
Distributed query: strange behavior.
Hello, guys! I'm running Solr 4.3.0 and I've noticed a strange behavior during distributed query execution. Currently I have three Solr servers as shards, and when I do the following query...

http://localhost:11080/twitter/data/select?q=*:*&rows=10&shards=localhost:11080/twitter/data,localhost:12080/twitter/data,localhost:13080/twitter/data&wt=json

numFound = 47131

I've queried each Solr shard server one by one and the total number of documents is correct. However, when I change the rows parameter from 10 to 100, the total numFound changes:

http://localhost:11080/twitter/data/select?q=*:*&rows=100&shards=localhost:11080/twitter/data,localhost:12080/twitter/data,localhost:13080/twitter/data&wt=json

numFound = 47124

And if I set rows=50, the numFound count changes again:

http://localhost:11080/twitter/data/select?q=*:*&rows=50&shards=localhost:11080/twitter/data,localhost:12080/twitter/data,localhost:13080/twitter/data&wt=json

numFound = 47129

What's happening here? Does anybody know? Is it a distributed search bug or something? Thank you very much in advance!

Best regards,
--
- Luis Cappa
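Editor's note: a numFound that shifts with the rows parameter is the classic symptom of the same uniqueKey existing on more than one shard. In distributed search, Solr only de-duplicates the ids it actually retrieves for the current page, so the reported total varies with how many documents are fetched. A hedged way to check (the uniqueKey field name "id" is an assumption) is to count a suspect id on each shard individually:

```text
# If any id returns numFound > 0 on more than one shard, the documents
# are duplicated across shards:
http://localhost:11080/twitter/data/select?q=id:SUSPECT_ID&rows=0
http://localhost:12080/twitter/data/select?q=id:SUSPECT_ID&rows=0
http://localhost:13080/twitter/data/select?q=id:SUSPECT_ID&rows=0
```

The fix is to ensure the indexing process routes each document to exactly one shard.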
Re: Boosting Documents
Oh thank you Chris, this is much clearer, and thank you for updating the wiki too.

On 05/22/2013 08:29 PM, Chris Hostetter wrote:

: NOTE: make sure norms are enabled (omitNorms=false in the schema.xml) for
: any fields where the index-time boost should be stored.
:
: In my case where I only need to boost the whole document (not a specific
: field), do I have to activate omitNorms=false for all the fields
: in the schema?

docBoost is really just syntactic sugar for a field boost on each field in the document -- it's factored into the norm value for each field in the document. (I'll update the wiki to make this more clear.) If you do a query that doesn't utilize any field which has norms, then the docBoost you specified when indexing the document never comes into play.

In general, doc boosts and field boosts, and the way they come into play as part of the field norm, are fairly inflexible and (in my opinion) antiquated. A much better way of dealing with this type of problem is also discussed in the section of the wiki you linked to. Immediately below...

http://wiki.apache.org/solr/SolrRelevancyFAQ#index-time_boosts

...you'll find...

http://wiki.apache.org/solr/SolrRelevancyFAQ#Field_Based_Boosting

-Hoss
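Editor's note: the field-based alternative Hoss points to stores the boost as an ordinary numeric field and applies it at query time instead of baking it into norms. A sketch, assuming edismax and a made-up float field named "popularity":

```text
# Multiplicative boost by a function of a stored field (edismax):
q=solr&defType=edismax&qf=title^2 text&boost=popularity

# Or an additive boost function:
q=solr&defType=edismax&qf=title^2 text&bf=log(popularity)
```

Unlike an index-time docBoost, this can be tuned or removed without reindexing.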
Re: Solr french search optimisation
Hello, thanks Cristian for your details. I totally agree with your explanation; these are two different aspects which I need to solve. Could you clarify a few more things:

- SpellcheckComponent and Phonetic: should they be used while indexing or only while querying?
- Does the spellcheck component return only the right spelling, or is it used to search within results?
- If I want to solve spelling, phonetic and stemming problems in the French language, can I use only one field, or should I use several with different filters?

Regards
David

Le 23/05/2013 08:59, Cristian Cascetta a écrit : Hello, I think you're confusing three different things: 1) schema and fields definition is for precision/recall: treating differently a field means different search results and results ranking 2) the pomppe a chaler problem is more a spellchecking problem http://wiki.apache.org/solr/SpellCheckComponent 3) solère and solaire is a phonetic search problem http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PhoneticFilterFactory Hope this helps a little, cristian 2013/5/23 It-forum it-fo...@meseo.fr Hello again, Is any one could help me, plase David Le 22/05/2013 18:09, It-forum a écrit : Hello to all, I'm trying to setup solr 4.2 to index and search into french content.
I defined a special fieldtype for french content : fieldType name=text_fr class=solr.TextField positionIncrementGap=100 analyzer type=index charFilter class=solr.**MappingCharFilterFactory mapping=mapping-**ISOLatin1Accent.txt/ tokenizer class=solr.** WhitespaceTokenizerFactory/ filter class=solr.**WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=0 splitOnCaseChange=1/ filter class=solr.**LowerCaseFilterFactory/ filter class=solr.**SnowballPorterFilterFactory language=French protected=protwords.txt/ /analyzer analyzer type=query charFilter class=solr.**MappingCharFilterFactory mapping=mapping-**ISOLatin1Accent.txt/ tokenizer class=solr.** WhitespaceTokenizerFactory/ filter class=solr.**WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=0 catenateNumbers=0 catenateAll=0 splitOnCaseChange=1/ filter class=solr.**LowerCaseFilterFactory/ filter class=solr.**SnowballPorterFilterFactory language=French protected=protwords.txt/ /analyzer /fieldType unfortunately, this field does not behave as I wish. I'd like to be able to get results from unwell spelled word. IE : I wish to get the same result typing Pompe à chaleur than typing pomppe a chaler or with solère and solaire I'm do not find the right way to create a fieldtype to reach this aim. thanks in advance for your help, do not hesitate for more information if need. Regards David
Re: Facet pivot 50.000.000 different values
In case anyone is interested, I solved my problem using the grouping feature:

query -- filter query (if any)
field -- the field whose distinct values you want to count (in my case field B)

    SolrQuery solrQuery = new SolrQuery(query);
    solrQuery.add("group", "true");
    solrQuery.add("group.field", "B"); // group by the field
    solrQuery.add("group.ngroups", "true");
    solrQuery.setRows(0);

And in the response, getNGroups() will give you the total number of distinct values (the total number of distinct B values).

Cheers,
Carlos.

2013/5/18 Carlos Bonilla carlosbonill...@gmail.com Hi Mikhail, yes, the thing is that I need to take different queries into account and that's why I can't use the Terms component. Cheers.

2013/5/17 Mikhail Khludnev mkhlud...@griddynamics.com On Fri, May 17, 2013 at 12:47 PM, Carlos Bonilla carlosbonill...@gmail.com wrote: We only need to calculate how many different B values have more than 1 document but it takes ages. Carlos, It's not clear whether you need to take the results of a query into account or just gather statistics from the index. If the latter, you can just enumerate terms and look at TermsEnum.docFreq(). Am I getting it right? -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
Re: Solr french search optimisation
Could you clarify a few more things:

- SpellcheckComponent and Phonetic, should they be used while indexing or only while querying?

SpellCheck: you can define a specific field for spellchecking (in this sense it's a query/schema-time concern), or you can create a specific vocabulary for spell-checking. I strongly suggest going through the documentation for this component, http://wiki.apache.org/solr/SpellCheckComponent — every time I used it I've had the need to customize and adapt the configuration.

- Does the spellcheck component return only the right spelling, or is it used to search within results?

I'm not sure, please check the documentation, but I remember that you can configure it to directly re-execute the spell-corrected query AND show some alternatives/suggestions to the user (obviously this is a display/frontend choice).

- If I want to solve spelling, phonetic and stemming problems in the French language, can I use only one field or should I use several with different filters?

I don't think it's possible to use only one field. In my experience I can suggest using multiple fields for multiple scopes; if you're scared by the index size, remember that fields that are indexed and NOT stored don't grow your index that much. Set as stored only the fields you need to display to the end user.
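Editor's note: for completeness, a minimal Solr 4.x spellcheck configuration in solrconfig.xml might look like the sketch below (the field name "content_fr" is an assumption; DirectSolrSpellChecker reads terms straight from the main index, so no separate spellcheck index build is needed):

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">content_fr</str> <!-- field name is illustrative -->
    <str name="classname">solr.DirectSolrSpellChecker</str>
  </lst>
</searchComponent>
```

Queries then enable it with spellcheck=true, and spellcheck.collate=true asks Solr to build a corrected query that can be re-executed, which matches the behavior Cristian describes.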
Re: Solr Faceting doesn't return values.
org.apache.solr.search.SyntaxError: Cannot parse 'mm_state_code:(TX)': Encountered ":" at line 1, column 14. Was expecting one of: ...

This suggests to me that you kept the df parameter in the query, hence it was forming mm_state_code:mm_state_code:(TX). Can you try it exactly the way I gave you - i.e. without the df parameter? Also, can you post schema.xml and the /select handler config from solrconfig.xml?

On 22 May 2013 18:36, samabhiK qed...@gmail.com wrote:

When I use your query, I get:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">400</int>
    <int name="QTime">12</int>
    <lst name="params">
      <str name="facet">true</str>
      <str name="df">mm_state_code</str>
      <str name="indent">true</str>
      <str name="q">mm_state_code:(TX)</str>
      <str name="_">1369244078714</str>
      <str name="debug">all</str>
      <str name="facet.field">sa_site_city</str>
      <str name="wt">xml</str>
    </lst>
  </lst>
  <lst name="error">
    <str name="msg">org.apache.solr.search.SyntaxError: Cannot parse 'mm_state_code:(TX)': Encountered ":" at line 1, column 14. Was expecting one of: EOF AND ... OR ... NOT ... "+" ... "-" ... BAREOPER ... "(" ... "*" ... "^" ... QUOTED ... TERM ... FUZZY_SLOP ... PREFIXTERM ... WILDTERM ... REGEXPTERM ... "[" ... "{" ... LPARAMS ... NUMBER ...</str>
    <int name="code">400</int>
  </lst>
</response>

Not sure why the data won't show up. Almost all the records have data in the field sa_site_city, and it is also indexed. :(

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Faceting-doesn-t-return-values-tp4065276p4065406.html
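Editor's note: following Upayavira's advice, the request without the df override might look like this (host, core and handler are placeholders; facet.mincount simply hides empty buckets):

```text
http://localhost:8983/solr/collection1/select?q=mm_state_code:TX&rows=0&facet=true&facet.field=sa_site_city&facet.mincount=1&wt=xml
```

Since the field is named explicitly in q, no default-field parameter is needed at all.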
Re: Solr4.2 - Fuzzy Search Problems
Thanks Chris. For my 2nd query (~1 returning words at an edit distance of 2), that may be the issue. I am still looking into my last issue; hopefully the JIRA helps to resolve that.

Chris Hostetter-3 wrote:

: 2) although I set editing distance to 1 in my query (e.g. worde~1), solr
: returns me results having 2 editing distance (like WORDOES, WORHEE, WORKEE,
: .. etc.)

fuzzy search works on *terms* in your index -- if you use a stemmer when you index your data (your schema shows that you are) then a word in your input like WORDOES might wind up in your index as a term within the edit distance you specified (ie: wordo or word or something similar)

: 3) Last and major issue, I had very little data at startup in my solr core (say
: around 1K - 2K); at that time, when I searched with worde~1, it was
: returning many records (around 450).
:
: Then I ingested a few more records into my solr core (say around 1K). They were
: ingested successfully, no errors or warnings in the log. After that, when I
: performed the same fuzzy search (worde~1) on the previous records only, not the
: newly ingested records, it did not return the previous results (around 450)
: either, and returned only 1 record in total, with the highlight WORD!N.

This sounds like the same issue as described in SOLR-4824...

https://issues.apache.org/jira/browse/SOLR-4824

-Hoss

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr4-2-Fuzzy-Search-Problems-tp4063199p4065576.html
index multiple files into one index entity
Hello Solr team,

I want to index multiple files into one Solr index entity with the same id. We are using Solr 4.1. I am trying it with the following source fragment:

    public void addContentSet(ContentSet contentSet) throws SearchProviderException {
        ...
        ContentStreamUpdateRequest csur = generateCSURequest(contentSet.getIndexId(), contentSet);
        String indexId = contentSet.getIndexId();
        ConcurrentUpdateSolrServer server = serverPool.getUpdateServer(indexId);
        server.request(csur);
        ...
    }

    private ContentStreamUpdateRequest generateCSURequest(String indexId, ContentSet contentSet) throws IOException {
        ContentStreamUpdateRequest csur = new ContentStreamUpdateRequest(confStore.getExtractUrl());
        ModifiableSolrParams parameters = csur.getParams();
        if (parameters == null) {
            parameters = new ModifiableSolrParams();
        }
        parameters.set("literalsOverride", false);
        // map the Tika default content attribute to the attribute named 'fulltext'
        parameters.set("fmap.content", SearchSystemAttributeDef.FULLTEXT.getName());
        // create an empty content stream; this seems necessary for ContentStreamUpdateRequest
        csur.addContentStream(new ImaContentStream());
        for (Content content : contentSet.getContentList()) {
            csur.addContentStream(new ImaContentStream(content));
            // for each content stream add additional attributes
            parameters.add("literal." + SearchSystemAttributeDef.CONTENT_ID.getName(), content.getBinaryObjectId().toString());
            parameters.add("literal." + SearchSystemAttributeDef.CONTENT_KEY.getName(), content.getContentKey());
            parameters.add("literal." + SearchSystemAttributeDef.FILE_NAME.getName(), content.getContentName());
            parameters.add("literal." + SearchSystemAttributeDef.MIME_TYPE.getName(), content.getMimeType());
        }
        parameters.set("literal.id", indexId);
        // adding some other attributes
        ...
        csur.setParams(parameters);
        return csur;
    }

During debugging I can see that the method server.request(csur) reads the buffer of each ImaContentStream.
When I look at the Solr Catalina log, I see that the attached files reach the Solr servlet:

INFO: Releasing directory:/data/V-4-1/master0/data/index
Apr 25, 2013 5:48:07 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: [master0] webapp=/solr-4-1 path=/update/extract params={literal.searchconnectortest15_c8150e41_cc49_4a .. literal.id=26afa5dc-40ad-442a-ac79-0e7880c06aa1 .} {add=[26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910940958720), 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910971367424), 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910976610304), 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910983950336), 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910989193216), 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910995484672)]} 0 58

But only the last one in the content list gets indexed. My schema.xml has the following field definitions:

<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="content" type="text_general" indexed="false" stored="true" multiValued="true"/>
<field name="contentkey" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="contentid" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="contentfilename" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="contentmimetype" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="fulltext" type="text_general" indexed="true" stored="true" multiValued="true"/>

I'm using the Tika ExtractingRequestHandler, which can extract binary files:

<requestHandler name="/update/extract" startup="lazy" class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="lowernames">true</str>
    <str name="uprefix">ignored_</str>
    <!-- capture link hrefs but ignore div attributes -->
    <str name="captureAttr">true</str>
    <str name="fmap.a">links</str>
    <str name="fmap.div">ignored_</str>
  </lst>
</requestHandler>

Is it possible to index multiple files with the same id? Is it necessary to implement my own RequestHandler?

With best regards
Mark
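Editor's note: the log excerpt above shows six separate adds for the same id, which explains the observed behavior — a Solr uniqueKey acts like a primary key, so each add with the same literal.id replaces the previous document rather than appending to it, and only the last stream survives. The usual workaround (a sketch; the suffix scheme below is made up) is to index one document per file with derived ids:

```text
# One extract request per file, each with its own id:
/update/extract?literal.id=26afa5dc-40ad-442a-ac79-0e7880c06aa1_1&literal.parentid=26afa5dc-40ad-442a-ac79-0e7880c06aa1&...
/update/extract?literal.id=26afa5dc-40ad-442a-ac79-0e7880c06aa1_2&literal.parentid=26afa5dc-40ad-442a-ac79-0e7880c06aa1&...
```

Searches can then group or filter on the shared parent field (here the hypothetical "parentid") to treat the files as one logical entity.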
Solr DIH - Small index still take time?
Hi,

This is the situation: I have two sources of data in my data import handler, one huge, the other tiny:

Source A: 10-20 records
Source B: 50,000,000 records

I was wondering what happens if I run the DIH just on Source A every 10 minutes, and only run the DIH on Source B every 24 hours. Would running my DIH on Source A be extremely quick, because the data we are importing is small, or would it still be time-consuming, because it would have to rebuild the index of the entire Solr core (i.e. 50,000,010 records)?

Thank you!

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-DIH-Small-index-still-take-time-tp4065582.html
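Editor's note: the answer depends mostly on the DIH command used — a full-import with the default clean=true deletes everything and rebuilds, while clean=false (or a delta-import) touches only the imported documents, leaving the other 50M alone. A sketch of the relevant requests (core, entity and host names are placeholders):

```text
# Re-import only Source A without wiping the rest of the index:
http://localhost:8983/solr/core/dataimport?command=full-import&entity=sourceA&clean=false&commit=true

# Or an incremental update driven by the entity's deltaQuery:
http://localhost:8983/solr/core/dataimport?command=delta-import&commit=true
```

With clean=false, a 10-20 record import should be quick regardless of the total index size (plus commit/warming overhead).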
Question about Coordination factor
Hello Folks,

Sorry, my last email was a bit messy, so I am sending it again. I have a question about the coordination factor, to ensure my understanding of this value is correct. If I have documents that contain some keywords like the following:

Doc1: A, B, C
Doc2: A, C
Doc3: B, C

And my query is A OR B OR C OR D. In this case, the coord factor value for each document will be the following:

Doc1: 3/4
Doc2: 2/4
Doc3: 2/4

In the same fashion, the respective coord factor values are the following if I have the query C OR D:

Doc1: 1/2
Doc2: 1/2
Doc3: 1/2

Is this correct, or did I miss something? Please correct me if I am wrong.

Regards,
Kazuaki
Re: Question about Coordination factor
This looks correct. On Thu, May 23, 2013 at 7:37 AM, Kazuaki Hiraga kazuaki.hir...@gmail.comwrote: Hello Folks, Sorry, my last email was a bit messy, so I am sending it again. I have a question about coordination factor to ensure my understanding of this value is correct. If I have documents that contain some keywords like the following: Doc1: A, B, C Doc2: A, C Doc3: B, C And my query is A OR B OR C OR D. In this case, Coord factor value for each documents will be the following: Doc1: 3/4 Doc2: 2/4 Doc3: 2/4 In the same fashion, respective value of coord factor is the following if I have a query C OR D: Doc1: 1/2 Doc2: 1/2 Doc3: 1/2 Is this correct? or Did I miss something? Please correct me if I am wrong. Regards, Kazuaki -- Anshum Gupta http://www.anshumgupta.net
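Editor's note: for reference, Lucene's DefaultSimilarity computes this as coord(overlap, maxOverlap) = overlap / maxOverlap — the number of query terms a document matches divided by the number of terms in the query. A quick sketch of the arithmetic from the example (just the formula, not Lucene code):

```java
public class CoordExample {
    // coord = matched query terms / total query terms, as in
    // DefaultSimilarity.coord(overlap, maxOverlap).
    static float coord(int overlap, int maxOverlap) {
        return (float) overlap / maxOverlap;
    }

    public static void main(String[] args) {
        // Query: A OR B OR C OR D (4 terms)
        System.out.println(coord(3, 4)); // Doc1 {A,B,C} matches 3 of 4 -> 0.75
        System.out.println(coord(2, 4)); // Doc2 {A,C} and Doc3 {B,C}   -> 0.5
        // Query: C OR D (2 terms); all three docs match only C
        System.out.println(coord(1, 2)); // -> 0.5
    }
}
```

This factor is multiplied into the document score, so documents matching more of the optional clauses rank higher, exactly as in Kazuaki's tables.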
Re: Can anyone explain this Solr query behavior?
Please post the results of adding debug=query to the URL. That'll tell us what the query parser spits out which is much easier to analyze. Best Erick On Wed, May 22, 2013 at 12:16 PM, Shankar Sundararaju shan...@ebrary.com wrote: This query returns 0 documents: *q=(+Title:() +Classification:() +Contributors:() +text:())* This returns 1 document: *q=doc-id:3000* And this returns 631580 documents when I was expecting 0: *q=doc-id:3000 AND (+Title:() +Classification:() +Contributors:() +text:())* Am I missing something here? Can someone please explain? I am using Solr 4.2.1 Thanks -Shankar
Re: fq facet on double and non-indexed field
bq: So cant we do fq on non-indexed field No. By definition the fq clause is a search and you can only search on indexed fields. Best Erick On Wed, May 22, 2013 at 5:08 PM, gpssolr2020 psgoms...@gmail.com wrote: Hi, I am trying to apply filtering on a non-indexed double field, but it's not returning any results. So can't we do fq on a non-indexed field? can not use FieldCache on a field which is neither indexed nor has doc values: EXCH_RT_AMT</str> <int name="code">400</int> We are using Solr 4.2. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/fq-facet-on-double-and-non-indexed-field-tp4065457.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Approach to apply full index from master to slaves?
What's your max warming searcher value? About warming queries, that may be _adding_ to your problem. I'd first try removing many of them, especially if you have your cache autowarm settings very high, try 16 or so. Autowarming is all about pre-loading the caches etc, but you reach diminishing returns pretty quickly. And what are all the threads doing? Best Erick On Wed, May 22, 2013 at 11:14 PM, William Bell billnb...@gmail.com wrote: We have a 3GB index. We index on the master and then replicate to the slaves. But the issue is that after the slaves switch over - we get deadlocking, # of threads increase to 500, and most times the SOLR instance just plain locks up. We tried adding a bunch of warming queries, but we still have a major performance hit and same issues. Are there any other tweaks and recommendations? Are others experiencing this? -- Bill Bell billnb...@gmail.com cell 720-256-8076
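For reference, the autowarm setting Erick mentions lives on the cache declarations in solrconfig.xml. A sketch with illustrative sizes, not a tuning recommendation:

```xml
<!-- solrconfig.xml sketch: a modest autowarmCount so a new searcher
     doesn't spend its warm-up replaying hundreds of cached filters -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="16"/>
```

The same autowarmCount attribute applies to queryResultCache; per the advice above, lowering it is usually the first thing to try when searcher switch-over stalls.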
hook to know when a DOC is committed.
I need to know when a document is committed in SOLR - i.e. is searchable. Is there anyone who has a solution on how to do this. I'm aware of three methods to create hooks for knowing when a doc is added or a commit is performed, but the doc(id) does not seem to be included for the commit-hooks (naturally I guess): A. subclass DirectUpdateHandler2 and override commit and/or addDoc B. subclass UpdateRequestProcessor (and include it in the update-chain) and override processAdd and/or processCommit C. implement SolrEventListener and implement postCommit and/or postSoftCommit The use-case is to let other parts of a system know that a document is searchable without having to create a poller which has to have state on when/how it polls. Any ideas or tricks out there? Fredrik -- Fredrik Rødland Mail:fredrik.rodl...@finn.no FINN.no Cell:+47 99 21 98 17 Twitter: @fredrikr Oslo, NORWAY Web: http://about.me/fmr
Re: search filter
On 23 May 2013 11:19, Kamal Palei palei.ka...@gmail.com wrote: HI Rafał Kuć I tried fq=Salary:[5+TO+10]+OR+Salary:0 and as well as fq=Salary:[5 TO 10] OR Salary:0 both, both the cases I retrieved 0 results. [...] Please try the suggested filter query from the Solr admin. interface, or by typing it directly into the browser URL bar. My guess is that there is still some issue with your Drupal/Solr integration. Regards, Gora
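Gora's suspicion about the Drupal/Solr integration can be tested from any client: build the fq with literal spaces and let the HTTP layer do the percent-encoding, rather than hand-inserting '+' in PHP. If '+' is inserted by hand and then encoded again, Solr receives literal plus signs, which is a plausible cause of the NumberFormatException above. A Python sketch of the intended request (values are the ones from the thread):

```python
from urllib.parse import urlencode

min_ctc, max_ctc = 5, 10
# Build the raw filter query with literal spaces; do NOT pre-insert '+'.
fq = f"salary:[{min_ctc} TO {max_ctc}] OR salary:0"
# The HTTP client turns spaces into '+' (and escapes [, ], :) here:
params = urlencode({"q": "*:*", "fq": fq})
print(params)  # q=%2A%3A%2A&fq=salary%3A%5B5+TO+10%5D+OR+salary%3A0
```

Note the salary field must be numeric (e.g. a trie long/int) in schema.xml for the range clause to behave as expected.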
Re: OPENNLP current patch compiling problem for 4.x branch
by definition, there is no LUCENE_44 constant in a 4.3 distro! Just change it to LUCENE_43 (or whatever you find in the Version class that suits your needs) or try this on a 4.x checkout. Best Erick On Thu, May 23, 2013 at 2:08 AM, Patrick Mi patrick...@touchpointgroup.com wrote: Hi, I checked out from here http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_3_0 and downloaded the latest patch LUCENE-2899-current.patch. Applied the patch ok but when I did 'ant compile' I got the following error: == [javac] /home/lucene_solr_4_3_0/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/FilterPayloadsFilter.java:43: error: cannot find symbol [javac] super(Version.LUCENE_44, input); [javac] ^ [javac] symbol: variable LUCENE_44 [javac] location: class Version [javac] 1 error == Compiled it on trunk without problem. Is this patch supposed to work for 4.X? Regards, Patrick
Re: Solr 4.3: node is seen as active in Zk while in recovery mode + endless recovery
Tangential to the issue you raise is that this is a huge tlog. It indicates that you aren't doing a hard commit (openSearcher=false) very often. That operation will truncate your tlog which should speed recovery/startup. You're also chewing up some memory with a tlog that size since pointers to the tlog are kept for each document. This comment doesn't address your comment about the change to ZkController, I'll leave that to someone who knows the code. Best Erick On Thu, May 23, 2013 at 3:14 AM, AlexeyK lex.kudi...@gmail.com wrote: a small change: it's not an endless loop, but a painfully slow processing which includes running a delete query and then insertion. Each document from the tlog takes tens of seconds to process (more than 100 times slower than during normal insertion process) -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-3-node-is-seen-as-active-in-Zk-while-in-recovery-mode-endless-recovery-tp4065549p4065551.html Sent from the Solr - User mailing list archive at Nabble.com.
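The hard-commit setup Erick describes can be expressed in solrconfig.xml roughly as follows; the 15-second interval is only an example, not a recommendation:

```xml
<!-- solrconfig.xml sketch: frequent automatic hard commits that truncate
     the transaction log without opening a new searcher, so they do not
     affect search visibility (soft commits still control that) -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxTime>15000</maxTime>           <!-- hard commit every 15 s -->
    <openSearcher>false</openSearcher> <!-- flush/truncate tlog only -->
  </autoCommit>
</updateHandler>
```

With this in place the tlog stays small, which keeps recovery and startup replay short.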
Re: hook to know when a DOC is committed.
A poller really is the most sensible, practical, and easiest route to go. If you add the versions=true parameter to your update request and have the transaction log enabled the update response will have the version numbers for each document id, then the poller can also tell if an update has been committed as well. Also, with soft commit, documents should be visible much more rapidly. Do you have some other, unmentioned requirement that you feel is biasing you against a sensible poller? Clue us in as to the nature of such a requirement. -- Jack Krupansky -Original Message- From: Fredrik Rødland Sent: Thursday, May 23, 2013 7:53 AM To: solr-user@lucene.apache.org Subject: hook to know when a DOC is committed. I need to know when a document is committed in SOLR - i.e. is searchable. Is there anyone who has a solution on how to do this. I'm aware of three methods to create hooks for knowing when a doc is added or a commit is performed, but the doc(id) does not seem to be included for the commit-hooks (naturally I guess): A. subclass DirectUpdateHandler2 and override commit and/or addDoc B. subclass UpdateRequestProcessor (and include it in the update-chain) and override processAdd and/or processCommit C. implement SolrEventListener and implement postCommit and/or postSoftCommit The use-case is to let other parts of a system know that a document is searchable without having to create a poller which has to have state on when/how it polls. Any ideas or tricks out there? Fredrik -- Fredrik Rødland Mail:fredrik.rodl...@finn.no FINN.no Cell:+47 99 21 98 17 Twitter: @fredrikr Oslo, NORWAY Web: http://about.me/fmr
Re: Solr DIH - Small index still take time?
That should work. Just watch out for (set value of) preImportDeleteQuery. Otherwise, when you do full import you may accidentally delete items from the other set. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Thu, May 23, 2013 at 6:25 AM, Spadez james_will...@hotmail.com wrote: Hi, This is the situation, I have two sources of data in my dataimport handler, one is huge, the other is tiny: Source A: 10-20 records Source B: 50,000,000 records I was wondering what happens if I was to do a DIH just on Source A every 10 mins, and only run the DIH on source B every 24 hours. Would running my DIH on Source A be extremely quick, because the data we are importing is small, or would it still be time consuming, because it would have to rebuild the index of the entire SOLR (i.e 50,000,010 records). Thank you! -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-DIH-Small-index-still-take-time-tp4065582.html Sent from the Solr - User mailing list archive at Nabble.com.
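One way to follow this advice, assuming each document carries a field naming its source (the field name, entity names, and SQL below are invented for illustration): scope preImportDeleteQuery so a full-import of the small source deletes only its own documents.

```xml
<!-- data-config.xml sketch: a full-import of sourceA deletes only docs
     previously tagged source:A, leaving sourceB's 50M docs untouched -->
<document>
  <entity name="sourceA"
          preImportDeleteQuery="source:A"
          query="SELECT id, title, 'A' AS source FROM source_a"/>
  <entity name="sourceB"
          preImportDeleteQuery="source:B"
          query="SELECT id, title, 'B' AS source FROM source_b"/>
</document>
```

Running the import with the entity parameter (e.g. command=full-import&amp;entity=sourceA) then touches only that entity, and the 10-minute job stays proportional to the 10-20 records, not the whole index.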
Re: index multiple files into one index entity
I just skimmed your post, but I'm responding to the last bit. If you have uniqueKey defined as id in schema.xml then no, you cannot have multiple documents with the same ID. Whenever a new doc comes in it replaces the old doc with that ID. You can remove the uniqueKey definition and do what you want, but there are very few Solr installations with no uniqueKey and it's probably a better idea to make your id's truly unique. Best Erick On Thu, May 23, 2013 at 6:14 AM, mark.ka...@t-systems.com wrote: Hello solr team, I want to index multiple fields into one solr index entity, with the same id. We are using solr 4.1 I try it with following source fragment: public void addContentSet(ContentSet contentSet) throws SearchProviderException { ... ContentStreamUpdateRequest csur = generateCSURequest(contentSet.getIndexId(), contentSet); String indexId = contentSet.getIndexId(); ConcurrentUpdateSolrServer server = serverPool.getUpdateServer(indexId); server.request(csur); ... } private ContentStreamUpdateRequest generateCSURequest(String indexId, ContentSet contentSet) throws IOException { ContentStreamUpdateRequest csur = new ContentStreamUpdateRequest(confStore.getExtractUrl()); ModifiableSolrParams parameters = csur.getParams(); if (parameters == null) { parameters = new ModifiableSolrParams(); } parameters.set(literalsOverride, false); // maps the tika default content attribute to the Attribute with name 'fulltext' parameters.set(fmap.content, SearchSystemAttributeDef.FULLTEXT.getName()); // create an empty content stream, this seams necessary for ContentStreamUpdateRequest csur.addContentStream(new ImaContentStream()); for (Content content : contentSet.getContentList()) { csur.addContentStream(new ImaContentStream(content)); // for each content stream add additional attributes parameters.add(literal. + SearchSystemAttributeDef.CONTENT_ID.getName(), content.getBinaryObjectId().toString()); parameters.add(literal. 
+ SearchSystemAttributeDef.CONTENT_KEY.getName(), content.getContentKey()); parameters.add(literal. + SearchSystemAttributeDef.FILE_NAME.getName(), content.getContentName()); parameters.add(literal. + SearchSystemAttributeDef.MIME_TYPE.getName(), content.getMimeType()); } parameters.set(literal.id , indexId); // adding some other attributes ... csur.setParams(parameters); return csur; } During debugging I can see that the method 'server.request(csur)' read for each ImaContentStream the buffer. When I'm looking on solr catalina log I see that the attached files reach the solr servlet. INFO: Releasing directory:/data/V-4-1/master0/data/index Apr 25, 2013 5:48:07 AM org.apache.solr.update.processor.LogUpdateProcessor finish INFO: [master0] webapp=/solr-4-1 path=/update/extract params={literal.searchconnectortest15_c8150e41_cc49_4a .. literal.id=26afa5dc-40ad-442a-ac79-0e7880c06aa1 . {add=[26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910940958720), 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910971367424), 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910976610304), 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910983950336), 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910989193216), 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910995484672)]} 0 58 But only the latest in the content list will be indexed. My schema.xml has the following field definitions: field name=id type=string indexed=true stored=true required=true / field name=content type=text_general indexed=false stored=true multiValued=true/ field name=contentkey type=string indexed=true stored=true multiValued=true/ field name=contentid type=string indexed=true stored=true multiValued=true/ field name=contentfilename type=string indexed=true stored=true multiValued=true/ field name=contentmimetype type=string indexed=true stored=true multiValued=true/ field name=fulltext type=text_general indexed=true stored=true multiValued=true/ I'm using the tika ExtractingRequestHandler which can extract binary files. 
requestHandler name=/update/extract startup=lazy class=solr.extraction.ExtractingRequestHandler lst name=defaults str name=lowernamestrue/str str name=uprefixignored_/str !-- capture link hrefs but ignore div attributes -- str name=captureAttrtrue/str str name=fmap.alinks/str str name=fmap.divignored_/str /lst /requestHandler Is it possible to index multiple files with the same id? It is necessary to implement my own
Re: hook to know when a DOC is committed.
On 23 May 2013, at 14:05, Jack Krupansky j...@basetechnology.com wrote: Hi Jack, thanks for your answer. A poller really is the most sensible, practical, and easiest route to go. If you add the versions=true parameter to your update request and have the transaction log enabled the update response will have the version numbers for each document id, then the poller can also tell if an update has been committed as well. The poller will still have to retry before advertising a doc as searchable - won't it? Do you have some other, unmentioned requirement that you feel is biasing you against a sensible poller? Clue us in as to the nature of such a requirement. My plan was to link Solr with our already established high-volume messaging system. So each time a document is searchable a message would be broadcast on a given channel. Our system consists of approx 10 indexes and 8 replications of each of these, so keeping track of all these by pollers would require a whole bunch of logic. Having a push-based system would facilitate knowing when a document is searchable quite a lot. regards, Fredrik -- Fredrik Rødland Mail:fredrik.rodl...@finn.no FINN.no Cell:+47 99 21 98 17 Twitter: @fredrikr Oslo, NORWAY Web: http://about.me/fmr
Re: fq facet on double and non-indexed field
Thanks Erick. I take it we can't do q on a non-indexed field either. What is the difference between q and fq, other than caching? Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/fq-facet-on-double-and-non-indexed-field-tp4065457p4065604.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr 4.3 fails to load MySQL driver
Hi, in my attempt to migrate from 3.6.x to 4.3.0 I stumbled upon an issue loading the MySQL driver from the [instance]/lib dir: Caused by: java.lang.ClassNotFoundException: org.apache.solr.handler.dataimport.DataImportHandler at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:423) at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789) at java.lang.ClassLoader.loadClass(ClassLoader.java:356) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:266) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:448) ... 18 more To narrow it down, I use the plain example configuration with the following changes: - Add a dataimport requestHandler to example/conf/solrconfig.xml (copied from a working solr 3.6.x) - Created example/conf/data-config.xml with <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" ... and SQL statement (both copied from a working solr 3.6.x) - placed the current driver mysql-connector-java-5.1.25-bin.jar in example/lib To my knowledge the lib dir is included in the path automatically. To make sure I tried to: - add <lib dir="./lib" /> explicitly to solrconfig.xml - add an absolute path to solrconfig.xml - changed solr.xml to use <solr persistent="true" sharedLib="lib"> All to no avail. System Info: - OpenJDK Runtime Environment 1.7.0_19 - Solr 4.3.0 - mysql-connector-java-5.1.25-bin.jar The same configuration ran fine with solr 3.6.x on the very same machine. Any help is appreciated! Cheers Chris -- Christian Köhler
Re: hook to know when a DOC is committed.
Yes, by definition, a poller retries. But by picking a sensible default for initial poll and retry (possibly an initial delay tuned to match average update/commit time) coupled with a traditional exponential backoff, that should not be a problem at all. In other words, an average request would not require a retry. Even so, do you feel that there is some sort of problem with retry? If so, please state what it is. Again, if you utilize soft commit, the time to commit will be significantly reduced. Or, just go ahead and force a commit on every update where the delay of a poll request is not acceptable. But I'd recommend the tuned poller. would require a whole bunch of logic - and you think the commit hooks and your push model implementation (on both Solr and client side) will be less logic?!! -- Jack Krupansky -Original Message- From: Fredrik Rødland Sent: Thursday, May 23, 2013 8:18 AM To: solr-user@lucene.apache.org Subject: Re: hook to know when a DOC is committed. On 23 May 2013, at 14:05, Jack Krupansky j...@basetechnology.com wrote: Hi Jack, thanks for your answer. A poller really is the most sensible, practical, and easiest route to go. If you add the versions=true parameter to your update request and have the transaction log enabled the update response will have the version numbers for each document id, then the poller can also tell if an update has been committed as well. The poller will still have to retry before advertising a doc as searchable - won't it? Do you have some other, unmentioned requirement that you feel is biasing you against a sensible poller? Clue us in as to the nature of such a requirement. My plan was to link Solr with our already established high-volume messaging system. So each time a document is searchable a message would be broadcast on a given channel. Our system consists of approx 10 indexes and 8 replications of each of these, so keeping track of all these by pollers would require a whole bunch of logic.
Having a push-based system would facilitate knowing when a document is searchable quite a lot. regards, Fredrik -- Fredrik Rødland Mail:fredrik.rodl...@finn.no FINN.no Cell:+47 99 21 98 17 Twitter: @fredrikr Oslo, NORWAY Web: http://about.me/fmr
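The tuned poller Jack recommends is only a few lines. A Python sketch with exponential backoff; the function and parameter names are invented, and the initial delay would be tuned to your average commit latency:

```python
import time

def wait_until_searchable(is_visible, first_delay=0.25, factor=2.0, max_tries=6):
    """Poll is_visible() with exponential backoff until it returns True.

    is_visible would wrap a Solr query (e.g. a real-time get or an id
    search) that reports whether the document is committed/searchable.
    Returns the number of attempts used; raises TimeoutError if exhausted.
    """
    delay = first_delay
    for attempt in range(1, max_tries + 1):
        if is_visible():
            return attempt
        time.sleep(delay)
        delay *= factor  # back off: 0.25s, 0.5s, 1s, ...
    raise TimeoutError("document still not searchable after %d tries" % max_tries)

# Simulated check that succeeds on the third poll:
state = {"polls": 0}
def fake_check():
    state["polls"] += 1
    return state["polls"] >= 3

print(wait_until_searchable(fake_check, first_delay=0.0))  # 3
```

With an initial delay matched to typical commit time, the common case returns on the first attempt, which is Jack's point that an average request would not require a retry.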
Bug in spellcheck.alternativeTermCount
I was playing around with spellcheck.alternativeTermCount and noticed that if it is set to zero, Solr gives an exception with certain queries. Maybe the value isn't supposed to be zero, but I don't think an exception is the expected behaviour. Rounak
Restaurant availability from database
Hi, I am building a website that lists restaurant information and I would also like to include availability information. I've created a custom ValueSourceParser and ValueSource that retrieve the availability information from a MySQL database. An example query is as follows. http://localhost:8983/solr/collection1/select?q=restaurant_id:*&fl=*,available:availability(2013-05-23, 2, 1700, 2359) This results in a pseudo (boolean) field available per document result and this works as expected. But my problem is that I also need the total number of available restaurants. Is there a way to count the number of available restaurants over the whole result set? I tried the stats component, but it doesn't seem to work with pseudo fields. Thanks in advance, Ronald -- View this message in context: http://lucene.472066.n3.nabble.com/Restaurant-availability-from-database-tp4065609.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr 4.3 fails to load MySQL driver
Check the Solr log on startup - it will explicitly state which lib directories/files will be used. Make sure they agree with where the DIH jars reside. Keep in mind that the directory structure of Solr changed - use the lib from 4.3 solrconfig. Try to use DIH in the standard Solr 4.3 example first. Then mimic that in your customization. -- Jack Krupansky -Original Message- From: Christian Köhler Sent: Thursday, May 23, 2013 8:25 AM To: solr-user@lucene.apache.org Subject: Solr 4.3 fails to load MySQL driver Hi, in my attempt to migrate for m 3.6.x to 4.3.0 I stumbled upon an issue loading the MySQL driver from the [instance]/lib dir: Caused by: java.lang.ClassNotFoundException: org.apache.solr.handler.dataimport.DataImportHandler at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:423) at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789) at java.lang.ClassLoader.loadClass(ClassLoader.java:356) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:266) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:448) ... 18 more To narrow it down, I use the plain example configuration with the following changes: - Add a dataimport requestHandler to example/conf/solrconfig.xml (copied from a working solr 3.6.x) - Created example/conf/data-config.xml with dataSource type=JdbcDataSource driver=com.mysql.jdbc.Driver ... and SQL statement (both copied from a working solr 3.6.x) - placed the current driver mysql-connector-java-5.1.25-bin.jar in example/lib As to my knowledge the lib dir is included automatically to the path. 
To make sure I tried to: - add lib dir=./lib / to explicit to solrconf.xml - add absolute path to solrarconf.xml - changed solr.xml to use solr persistent=true sharedLib=lib All to no avail. System Info: - OpenJDK Runtime Environmentm 1.7.0_19 - Solr 4.3.0 - mysql-connector-java-5.1.25-bin.jar The same configuration run fine with a solr 3.6.x on the very same machine. Any help is appreciated! Cheers Chris -- Christian Köhler
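If the startup log shows the DIH jars are not on the path, declaring them explicitly in solrconfig.xml usually fixes the ClassNotFoundException for DataImportHandler. A sketch; the relative paths match the stock 4.3 example layout (relative to the core's instanceDir) and may need adjusting:

```xml
<!-- solrconfig.xml sketch: load the DIH jars shipped in dist/ and the
     JDBC driver dropped into the core's lib/ directory -->
<lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />
<lib dir="./lib" regex="mysql-connector-java-.*\.jar" />
```

Note the exception above is for DataImportHandler itself, not the MySQL driver, so it is the solr-dataimporthandler jar that is missing first; the driver jar only matters once DIH loads.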
Shardsplitting
Hi, when I have a collection with 3 shards and 2 replicas for each shard and want to split shard1, does it matter where in the cloud I start the splitshard command, or should it be started on the master of that shard? BR, Arkadi
Re: Solr 4.3: node is seen as active in Zk while in recovery mode + endless recovery
Huge tlogs seem to be a common problem. Should we make the tlog flush automatically above a certain file size? Could it be configurable on the updateLog tag? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 23 May 2013 at 14:03, Erick Erickson erickerick...@gmail.com wrote: Tangential to the issue you raise is that this is a huge tlog. It indicates that you aren't doing a hard commit (openSearcher=false) very often. That operation will truncate your tlog which should speed recovery/startup. You're also chewing up some memory with a tlog that size since pointers to the tlog are kept for each document. This comment doesn't address your comment about the change to ZkController, I'll leave that to someone who knows the code. Best Erick On Thu, May 23, 2013 at 3:14 AM, AlexeyK lex.kudi...@gmail.com wrote: a small change: it's not an endless loop, but a painfully slow processing which includes running a delete query and then insertion. Each document from the tlog takes tens of seconds to process (more than 100 times slower than during normal insertion process) -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-3-node-is-seen-as-active-in-Zk-while-in-recovery-mode-endless-recovery-tp4065549p4065551.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Regular expression in solr
Regex expressions work on individual terms. Positional information is irrelevant when it comes to regex matching - it's not matching across terms*. The syntax allowed is documented here https://lucene.apache.org/core/4_3_0/core/org/apache/lucene/util/automaton/RegExp.html - it's not quite the full standard syntax. ^ and $ aren't mentioned there. The beginning of the regex implicitly starts at the beginning of the term. So whatever constitutes a term is the granularity of what matches. string fields operate on the entire string. A text field that is analyzed will regex match on the individual terms that emerge from the index-time analysis process. Erik * Though with the surround query parser you can do proximity matching using wildcarded terms in sophisticated ways. On May 22, 2013, at 16:42 , Lance Norskog wrote: If the indexed data includes positions, it should be possible to implement ^ and $ as the first and last positions. On 05/22/2013 04:08 AM, Oussama Jilal wrote: There is no ^ or $ in the solr regex since the regular expression will match tokens (not the complete indexed text). So the results you get will basicly depend on your way of indexing, if you use the regex on a tokenized field and that is not what you want, try to use a copy field wich is not tokenized and then use the regex on that one. On 05/22/2013 11:53 AM, Stéphane Habett Roux wrote: I just can't get the $ endpoint to work. I am not sure but I heard it works with the Java Regex engine (a little obvious if it is true ...), so any Java regex tutorial would help you. On 05/22/2013 11:42 AM, Sagar Chaturvedi wrote: Yes, it works for me too. But many times result is not as expected. Is there some guide on use of regex in solr? 
-Original Message- From: Oussama Jilal [mailto:jilal.ouss...@gmail.com] Sent: Wednesday, May 22, 2013 4:00 PM To: solr-user@lucene.apache.org Subject: Re: Regular expression in solr I don't think so, it always worked for me without anything special, just try it and see :) On 05/22/2013 11:26 AM, Sagar Chaturvedi wrote: @Oussama Thank you for your reply. Is it as simple as that? I mean no additional settings required? -Original Message- From: Oussama Jilal [mailto:jilal.ouss...@gmail.com] Sent: Wednesday, May 22, 2013 3:37 PM To: solr-user@lucene.apache.org Subject: Re: Regular expression in solr You can write a regular expression query like this (you need to specify the regex between slashes / ) : fieldName:/[rR]egular.*/ On 05/22/2013 10:51 AM, Sagar Chaturvedi wrote: Hi, How do we search based upon regular expressions in solr? Regards, Sagar DISCLAIMER: The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only. It shall not attach any liability on the originator or NEC or its affiliates. Any views or opinions presented in this email are solely those of the author and may not necessarily reflect the opinions of NEC or its affiliates. Any form of reproduction, dissemination, copying, disclosure, modification, distribution and / or publication of this message without the prior written consent of the author of this e-mail is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately.
AW: index multiple files into one index entity
Hello Erick, Thank you for your fast answer. Maybe I didn't express my question clearly. I want to index many files into one index entity. I want the same behavior as any other multivalued field, which can be indexed under one unique id. So I think every ContentStreamUpdateRequest represents one index entity, doesn't it? And with each addContentStream I will add one file to this entity. Thank you and best regards, Mark -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Thursday, 23 May 2013 14:11 To: solr-user@lucene.apache.org Subject: Re: index multiple files into one index entity I just skimmed your post, but I'm responding to the last bit. If you have uniqueKey defined as id in schema.xml then no, you cannot have multiple documents with the same ID. Whenever a new doc comes in it replaces the old doc with that ID. You can remove the uniqueKey definition and do what you want, but there are very few Solr installations with no uniqueKey and it's probably a better idea to make your id's truly unique. Best Erick On Thu, May 23, 2013 at 6:14 AM, mark.ka...@t-systems.com wrote: Hello solr team, I want to index multiple fields into one solr index entity, with the same id. We are using solr 4.1 I try it with following source fragment: public void addContentSet(ContentSet contentSet) throws SearchProviderException { ... ContentStreamUpdateRequest csur = generateCSURequest(contentSet.getIndexId(), contentSet); String indexId = contentSet.getIndexId(); ConcurrentUpdateSolrServer server = serverPool.getUpdateServer(indexId); server.request(csur); ...
} private ContentStreamUpdateRequest generateCSURequest(String indexId, ContentSet contentSet) throws IOException { ContentStreamUpdateRequest csur = new ContentStreamUpdateRequest(confStore.getExtractUrl()); ModifiableSolrParams parameters = csur.getParams(); if (parameters == null) { parameters = new ModifiableSolrParams(); } parameters.set(literalsOverride, false); // maps the tika default content attribute to the Attribute with name 'fulltext' parameters.set(fmap.content, SearchSystemAttributeDef.FULLTEXT.getName()); // create an empty content stream, this seams necessary for ContentStreamUpdateRequest csur.addContentStream(new ImaContentStream()); for (Content content : contentSet.getContentList()) { csur.addContentStream(new ImaContentStream(content)); // for each content stream add additional attributes parameters.add(literal. + SearchSystemAttributeDef.CONTENT_ID.getName(), content.getBinaryObjectId().toString()); parameters.add(literal. + SearchSystemAttributeDef.CONTENT_KEY.getName(), content.getContentKey()); parameters.add(literal. + SearchSystemAttributeDef.FILE_NAME.getName(), content.getContentName()); parameters.add(literal. + SearchSystemAttributeDef.MIME_TYPE.getName(), content.getMimeType()); } parameters.set(literal.id , indexId); // adding some other attributes ... csur.setParams(parameters); return csur; } During debugging I can see that the method 'server.request(csur)' read for each ImaContentStream the buffer. When I'm looking on solr catalina log I see that the attached files reach the solr servlet. INFO: Releasing directory:/data/V-4-1/master0/data/index Apr 25, 2013 5:48:07 AM org.apache.solr.update.processor.LogUpdateProcessor finish INFO: [master0] webapp=/solr-4-1 path=/update/extract params={literal.searchconnectortest15_c8150e41_cc49_4a .. literal.id=26afa5dc-40ad-442a-ac79-0e7880c06aa1 . 
{add=[26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910940958720), 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910971367424), 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910976610304), 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910983950336), 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910989193216), 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910995484672)]} 0 58 But only the latest in the content list will be indexed. My schema.xml has the following field definitions: field name=id type=string indexed=true stored=true required=true / field name=content type=text_general indexed=false stored=true multiValued=true/ field name=contentkey type=string indexed=true stored=true multiValued=true/ field name=contentid type=string indexed=true stored=true multiValued=true/ field name=contentfilename type=string indexed=true stored=true multiValued=true/ field name=contentmimetype type=string indexed=true stored=true multiValued=true/ field name=fulltext type=text_general
Re: Solr 4.3: node is seen as active in Zk while in recovery mode + endless recovery
the hard commit is set to about 20 minutes, while the RAM buffer is 256MB. We will add more frequent hard commits without reopening the searcher, thanks for the tip. From what I understood from the code, for each 'add' command there is a test for a 'delete by query'. If there is an older DBQ, it's run after the 'add' operation if its version > the 'add' version. In my case, there are a lot of documents to be inserted and a single large DBQ. My question is: shouldn't this be done in bulk? Why is it necessary to run the DBQ after each insertion? Suppose there are 1000 insertions: it's run 1000 times. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-3-node-is-seen-as-active-in-Zk-while-in-recovery-mode-endless-recovery-tp4065549p4065628.html Sent from the Solr - User mailing list archive at Nabble.com.
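To make the concern concrete, here is a rough model of the behaviour described, re-applying a reordered delete-by-query after every single add (illustrative only, not Solr's actual code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class ReorderedDbq {
    static List<String> index = new ArrayList<>();
    static int dbqRuns = 0;
    // Stand-in for a delete-by-query whose version is newer than the adds being replayed.
    static Predicate<String> pendingDbq = doc -> doc.startsWith("old-");

    static void add(String doc) {
        index.add(doc);
        // The reordered DBQ is re-applied after every single add.
        index.removeIf(pendingDbq);
        dbqRuns++;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            add("doc-" + i);
        }
        System.out.println(dbqRuns); // 1000: one DBQ execution per insertion
    }
}
```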
Broken pipe
Any idea why I got a Broken pipe? INFO - 2013-05-23 13:37:19.881; org.apache.solr.core.SolrCore; [messages_shard3_replica1] webapp=/solr path=/select/ params={sort=score+desc&fl=id,smsc_module,smsc_modulekey,smsc_userid,smsc_ssid,smsc_description,smsc_description_ngram,smsc_content,smsc_content_ngram,smsc_courseid,smsc_lastdate,score,metadata_stream_size,metadata_stream_source_info,metadata_stream_name,metadata_stream_content_type,last_modified,author,title,subject&debugQuery=true&defaultOperator=AND&indent=on&start=0&q=(smsc_content:banaan+||+smsc_content_ngram:banaan+||+smsc_description:banaan+||+smsc_description_ngram:banaan)+%26%26+(smsc_lastdate:[2000-04-23T15:14:40Z+TO+2013-05-23T15:14:40Z])+%26%26+(smsc_ssid:9)&collection=messages&wt=xml&rows=50&version=2.2} hits=119 status=0 QTime=81108 ERROR - 2013-05-23 13:37:19.892; org.apache.solr.common.SolrException; null:ClientAbortException: java.net.SocketException: Broken pipe at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:406) at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:342) at org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:431) at org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:419) at org.apache.catalina.connector.CoyoteOutputStream.write(CoyoteOutputStream.java:91) at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282) at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125) at java.io.OutputStreamWriter.write(OutputStreamWriter.java:207) at org.apache.solr.util.FastWriter.flush(FastWriter.java:141) at org.apache.solr.util.FastWriter.flushBuffer(FastWriter.java:155) at org.apache.solr.response.TextResponseWriter.close(TextResponseWriter.java:85) at org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:41) at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:644) at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:372) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1008) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589) at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:722) Caused by: java.net.SocketException: Broken pipe at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109) at java.net.SocketOutputStream.write(SocketOutputStream.java:153) at org.apache.coyote.http11.InternalOutputBuffer.realWriteBytes(InternalOutputBuffer.java:215) at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:480) at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:366) at 
org.apache.coyote.http11.InternalOutputBuffer$OutputStreamOutputBuffer.doWrite(InternalOutputBuffer.java:240) at org.apache.coyote.http11.filters.ChunkedOutputFilter.doWrite(ChunkedOutputFilter.java:117) at org.apache.coyote.http11.AbstractOutputBuffer.doWrite(AbstractOutputBuffer.java:192) at org.apache.coyote.Response.doWrite(Response.java:505) at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:401) ... 30 more ERROR - 2013-05-23 13:37:19.893; org.apache.solr.common.SolrException; null:ClientAbortException: java.net.SocketException: Broken pipe at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:406) at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:342) at org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:431) at org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:419) at
Re: Distributed query: strange behavior.
On 5/23/2013 1:51 AM, Luis Cappa Banda wrote: I've queried each Solr shard server one by one and the total number of documents is correct. However, when I change the rows parameter from 10 to 100 the total numFound of documents changes: I've seen this problem on the list before and the cause has been determined each time to be documents with the same uniqueKey value appearing in more than one shard. What I think happens here: With rows=10, you get the top ten docs from each of the three shards, and each shard sends its numFound for that query to the core that's coordinating the search. The coordinator adds up numFound, looks through those thirty docs, and arranges them according to the requested sort order, returning only the top 10. In this case, there happen to be no duplicates. With rows=100, you get a total of 300 docs. This time, duplicates are found and removed by the coordinator. I think that the coordinator adjusts the total numFound by the number of duplicate documents it removed, in an attempt to be more accurate. I don't know if adjusting numFound when duplicates are found in a sharded query is the right thing to do; I'll leave that for smarter people. Perhaps Solr should return a message with the results saying that duplicates were found, and if a config option is not enabled, the server should throw an exception and return a 4xx HTTP error code. One idea for a config parameter name would be allowShardDuplicates, but something better can probably be found. Thanks, Shawn
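A sketch of the merge logic described above, with illustrative names (this is a simplification for discussion, not the actual coordinator code):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ShardMerge {
    // Sum the per-shard numFound values, then subtract duplicates discovered
    // while merging the documents each shard actually returned.
    static long mergedNumFound(long[] shardNumFound, List<List<String>> returnedIds) {
        long total = 0;
        for (long n : shardNumFound) {
            total += n;
        }
        Set<String> seen = new HashSet<>();
        long duplicates = 0;
        for (List<String> ids : returnedIds) {
            for (String id : ids) {
                if (!seen.add(id)) {
                    duplicates++;
                }
            }
        }
        return total - duplicates;
    }

    public static void main(String[] args) {
        long[] numFound = {100, 100, 100};
        // Small rows: the returned top docs happen to contain no duplicates.
        System.out.println(mergedNumFound(numFound,
                List.of(List.of("a", "b"), List.of("c"), List.of("d")))); // 300
        // Larger rows: two duplicate ids surface during the merge, so the total shrinks.
        System.out.println(mergedNumFound(numFound,
                List.of(List.of("a", "b"), List.of("b"), List.of("a")))); // 298
    }
}
```

This shows why the reported total depends on rows: only duplicates that happen to appear among the returned documents can be detected and subtracted.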
AW: Broken pipe
This usually happens when the client sending the request to Solr has given up waiting for the response (terminated the connection). In your example, we see that the Solr query time is 81 seconds. Probably the client issuing the request has a time-out of maybe 30 or 60 seconds. André Von: Arkadi Colson [ark...@smartbit.be] Gesendet: Donnerstag, 23. Mai 2013 15:40 An: solr-user@lucene.apache.org Betreff: Broken pipe Any idea why I got a Broken pipe? INFO - 2013-05-23 13:37:19.881; org.apache.solr.core.SolrCore; [messages_shard3_replica1] webapp=/solr path=/select/ params={sort=score+descfl=id,smsc_module,smsc_modulekey,smsc_userid,smsc_ssid,smsc_description,smsc_description_ngram,smsc_content,smsc_content_ngram,smsc_courseid,smsc_lastdate,score,metadata_stream_size,metadata_stream_source_info,metadata_stream_name,metadata_stream_content_type,last_modified,author,title,subjectdebugQuery=truedefaultOperator=ANDindent=onstart=0q=(smsc_content:banaan+||+smsc_content_ngram:banaan+||+smsc_description:banaan+||+smsc_description_ngram:banaan)+%26%26+(smsc_lastdate:[2000-04-23T15:14:40Z+TO+2013-05-23T15:14:40Z])+%26%26+(smsc_ssid:9)collection=messageswt=xmlrows=50version=2.2} hits=119 status=0 QTime=81108 ERROR - 2013-05-23 13:37:19.892; org.apache.solr.common.SolrException; null:ClientAbortException: java.net.SocketException: Broken pipe at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:406) at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:342) at org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:431) at org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:419) at org.apache.catalina.connector.CoyoteOutputStream.write(CoyoteOutputStream.java:91) at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282) at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125) at java.io.OutputStreamWriter.write(OutputStreamWriter.java:207) at 
org.apache.solr.util.FastWriter.flush(FastWriter.java:141) at org.apache.solr.util.FastWriter.flushBuffer(FastWriter.java:155) at org.apache.solr.response.TextResponseWriter.close(TextResponseWriter.java:85) at org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:41) at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:644) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:372) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1008) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589) at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:722) Caused by: java.net.SocketException: Broken pipe at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109) at 
java.net.SocketOutputStream.write(SocketOutputStream.java:153) at org.apache.coyote.http11.InternalOutputBuffer.realWriteBytes(InternalOutputBuffer.java:215) at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:480) at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:366) at org.apache.coyote.http11.InternalOutputBuffer$OutputStreamOutputBuffer.doWrite(InternalOutputBuffer.java:240) at org.apache.coyote.http11.filters.ChunkedOutputFilter.doWrite(ChunkedOutputFilter.java:117) at org.apache.coyote.http11.AbstractOutputBuffer.doWrite(AbstractOutputBuffer.java:192) at org.apache.coyote.Response.doWrite(Response.java:505) at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:401) ... 30 more ERROR -
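The mechanism André describes is easy to reproduce outside Solr: once the client has hung up, the server's later writes fail, which Tomcat surfaces as ClientAbortException: Broken pipe. A small self-contained demonstration with plain sockets (no Solr involved; exact exception text varies by OS):

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class BrokenPipeDemo {
    // Returns true if writing to a peer that already disconnected raises an
    // IOException -- the situation Tomcat reports as ClientAbortException.
    static boolean lateWriteFails() {
        try (ServerSocket server = new ServerSocket(0);
             Socket client = new Socket("127.0.0.1", server.getLocalPort());
             Socket accepted = server.accept()) {
            client.close();      // the client gives up (e.g. its read timeout fired)
            Thread.sleep(100);   // give the close a moment to propagate
            OutputStream out = accepted.getOutputStream();
            byte[] chunk = new byte[8192];
            for (int i = 0; i < 1000; i++) {
                out.write(chunk); // the "response body" the server still tries to send
            }
            return false;        // all writes succeeded (not expected)
        } catch (IOException | InterruptedException e) {
            return true;         // broken pipe / connection reset
        }
    }

    public static void main(String[] args) {
        System.out.println(lateWriteFails()); // true on typical systems
    }
}
```

The fix is on the client side: raise the client's read timeout (or speed up the query) so it does not abandon the connection before Solr finishes writing the response.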
Re: Solr 4.3 fails to load MySQL driver
On 5/23/2013 6:25 AM, Christian Köhler wrote: in my attempt to migrate from 3.6.x to 4.3.0 I stumbled upon an issue loading the MySQL driver from the [instance]/lib dir: Caused by: java.lang.ClassNotFoundException: org.apache.solr.handler.dataimport.DataImportHandler The best thing to do is take the <lib> directives out of solrconfig.xml and put your extra jars in ${solr.solr.home}/lib, where solr.solr.home is the directory where solr.xml lives. NB: There might be two solr.xml files in your setup, but if there are, one of them will tell your servlet container how to start solr, the correct file tells solr about cores. Normally, you can set up another global lib directory, absolute or relative to solr.solr.home, with the sharedLib attribute in solr.xml, but that doesn't work in 4.3.0 - only ${solr.solr.home}/lib works in that specific version. Here's the bug report: https://issues.apache.org/jira/browse/SOLR-4791 I discovered another glitch last night in the 4.4 development version and filed a bug report, but I've been informed that I've been doing it wrong for the last couple of years: https://issues.apache.org/jira/browse/SOLR-4852 Thanks, Shawn
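For reference, a sketch of an old-style solr.xml using the sharedLib attribute (core names and paths here are illustrative; as noted above, per SOLR-4791 the attribute is ignored in 4.3.0, where only ${solr.solr.home}/lib works):

```xml
<!-- sharedLib is resolved relative to solr.solr.home unless absolute -->
<solr persistent="true" sharedLib="lib">
  <cores adminPath="/admin/cores">
    <core name="collection1" instanceDir="collection1" />
  </cores>
</solr>
```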
Problem with document routing with Solr 4.2.1
Hi All, I just started indexing data in my brand new Solr Cloud running on 4.2.1. Since I am a big user of the grouping feature, I need to route my documents to the proper shard. Following the instructions found here: http://docs.lucidworks.com/display/solr/Shards+and+Indexing+Data+in+SolrCloud I set my document id to something like 'fieldA!id' where fieldA is the key I want to use to distribute my documents. (All documents with the same value for fieldA will be sent to the same shard). When I query my index, I can see that the number of documents increases but there are no fields at all in the index. http://10.0.5.211:8201/solr/Current/select?q=*:*

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">11</int>
    <lst name="params">
      <str name="q">*:*</str>
    </lst>
  </lst>
  <result name="response" numFound="26318" start="0" maxScore="1.0"/>
</response>

Specifying fields in the 'fl' parameter does nothing. What am I doing wrong?
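As a side note on the routing itself: with composite-id routing, only the part before the '!' determines the target shard. Solr hashes the route key with MurmurHash3 over the shards' hash ranges; the sketch below uses String.hashCode purely to illustrate that the shard choice depends solely on the prefix, so all documents sharing a fieldA value land together:

```java
public class CompositeIdRouting {
    // Illustrative only: real Solr uses MurmurHash3 and per-shard hash ranges,
    // not hashCode modulo numShards.
    static int shardFor(String docId, int numShards) {
        int bang = docId.indexOf('!');
        String routeKey = bang >= 0 ? docId.substring(0, bang) : docId;
        return Math.floorMod(routeKey.hashCode(), numShards);
    }

    public static void main(String[] args) {
        System.out.println(shardFor("fieldA-value!doc-1", 15)
                == shardFor("fieldA-value!doc-2", 15)); // true: same prefix, same shard
    }
}
```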
Re: Problem with document routing with Solr 4.2.1
That's strange. The default value of rows param is 10 so you should be getting 10 results back unless your StandardRequestHandler config in solrconfig has set rows to 0 or if none of your fields are stored. On Thu, May 23, 2013 at 7:40 PM, Jean-Sebastien Vachon jean-sebastien.vac...@wantedanalytics.com wrote: Hi All, I just started indexing data in my brand new Solr Cloud running on 4.2.1. Since I am a big user of the grouping feature, I need to route my documents on the proper shard. Following the instruction found here: http://docs.lucidworks.com/display/solr/Shards+and+Indexing+Data+in+SolrCloud I set my document id to something like this 'fieldA!id' where fieldA is the key I want to use to distribute my documents. (All documents with the same value for fieldA will be sent to the same shard). When I query my index, I can see that the number of documents increase but there are no fields at all in the index. http://10.0.5.211:8201/solr/Current/select?q=*:* response lst name=responseHeader int name=status0/int int name=QTime11/int lst name=params str name=q*:*/str /lst /lst result name=response numFound=26318 start=0 maxScore=1.0/ /response Specifying fields in the 'fl' parameter does nothing. What am I doing wrong? -- Regards, Shalin Shekhar Mangar.
RE: Bug in spellcheck.alternativeTermCount
Can you give instructions on how to reproduce the problem? James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Rounak Jain [mailto:rouna...@gmail.com] Sent: Thursday, May 23, 2013 7:36 AM To: solr-user@lucene.apache.org Subject: Bug in spellcheck.alternativeTermCount I was playing around with spellcheck.alternativeTermCount and noticed that if it is set to zero, Solr gives an exception with certain queries. Maybe the value isn't supposed to be zero, but I don't think an exception is the expected behaviour. Rounak
Re: Broken pipe
Also happens (same reason) if you are behind a smart load-balance and it decides to time out and fail over. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Thu, May 23, 2013 at 9:59 AM, André Widhani andre.widh...@digicol.de wrote: This usually happens when the client sending the request to Solr has given up waiting for the response (terminated the connection). In your example, we see that the Solr query time is 81 seconds. Probably the client issuing the request has a time-out of maybe 30 or 60 seconds. André Von: Arkadi Colson [ark...@smartbit.be] Gesendet: Donnerstag, 23. Mai 2013 15:40 An: solr-user@lucene.apache.org Betreff: Broken pipe Any idea why I got a Broken pipe? INFO - 2013-05-23 13:37:19.881; org.apache.solr.core.SolrCore; [messages_shard3_replica1] webapp=/solr path=/select/ params={sort=score+descfl=id,smsc_module,smsc_modulekey,smsc_userid,smsc_ssid,smsc_description,smsc_description_ngram,smsc_content,smsc_content_ngram,smsc_courseid,smsc_lastdate,score,metadata_stream_size,metadata_stream_source_info,metadata_stream_name,metadata_stream_content_type,last_modified,author,title,subjectdebugQuery=truedefaultOperator=ANDindent=onstart=0q=(smsc_content:banaan+||+smsc_content_ngram:banaan+||+smsc_description:banaan+||+smsc_description_ngram:banaan)+%26%26+(smsc_lastdate:[2000-04-23T15:14:40Z+TO+2013-05-23T15:14:40Z])+%26%26+(smsc_ssid:9)collection=messageswt=xmlrows=50version=2.2} hits=119 status=0 QTime=81108 ERROR - 2013-05-23 13:37:19.892; org.apache.solr.common.SolrException; null:ClientAbortException: java.net.SocketException: Broken pipe at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:406) at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:342) at 
org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:431) at org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:419) at org.apache.catalina.connector.CoyoteOutputStream.write(CoyoteOutputStream.java:91) at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282) at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125) at java.io.OutputStreamWriter.write(OutputStreamWriter.java:207) at org.apache.solr.util.FastWriter.flush(FastWriter.java:141) at org.apache.solr.util.FastWriter.flushBuffer(FastWriter.java:155) at org.apache.solr.response.TextResponseWriter.close(TextResponseWriter.java:85) at org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:41) at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:644) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:372) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1008) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589) at 
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:722) Caused by: java.net.SocketException: Broken pipe at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109) at java.net.SocketOutputStream.write(SocketOutputStream.java:153) at org.apache.coyote.http11.InternalOutputBuffer.realWriteBytes(InternalOutputBuffer.java:215) at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:480) at
Re: Restaurant availability from database
Check out Gilt's presentation. It might give you some ideas, including possibly on refactoring your entities around 'availability' as a document: http://www.lucenerevolution.org/sites/default/files/Personalized%20Search%20on%20the%20Largest%20Flash%20Sale%20Site%20in%20America.pdf Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Thu, May 23, 2013 at 8:36 AM, rajh ron...@trimm.nl wrote: Hi, I am building a website that lists restaurant information and I'd also like to include the availability information. I've created a custom ValueSourceParser and ValueSource that retrieve the availability information from a MySQL database. An example query is as follows. http://localhost:8983/solr/collection1/select?q=restaurant_id:*&fl=*,available:availability(2013-05-23, 2, 1700, 2359) This results in a pseudo (boolean) field available per document result and this works as expected. But my problem is that I also need the total number of available restaurants. Is there a way to count the number of available restaurants over the whole result set? I tried the stats component, but it doesn't seem to work with pseudo fields. Thanks in advance, Ronald -- View this message in context: http://lucene.472066.n3.nabble.com/Restaurant-availability-from-database-tp4065609.html Sent from the Solr - User mailing list archive at Nabble.com.
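One approach worth trying for the count itself, assuming the custom availability() function can be evaluated inside a function query and returns 0/1: wrap it in an frange filter and read numFound, which then counts the available restaurants without fetching any rows. Untested sketch against the original query:

```
http://localhost:8983/solr/collection1/select?q=restaurant_id:*&rows=0
    &fq={!frange l=1 u=1}availability(2013-05-23, 2, 1700, 2359)
```

With rows=0 no documents are returned; numFound in the response is the number of matching (available) restaurants.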
RE: Problem with document routing with Solr 4.2.1
I know. If I stop routing the documents and simply use a standard 'id' field then I am getting back my fields. I forgot to tell you how the collection was created. http://localhost:8201/solr/admin/collections?action=CREATE&name=Current&numShards=15&replicationFactor=3&maxShardsPerNode=9 Since I am using the numShards parameter, composite routing should be working... unless I misunderstood something -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: May-23-13 10:27 AM To: solr-user@lucene.apache.org Subject: Re: Problem with document routing with Solr 4.2.1 That's strange. The default value of rows param is 10 so you should be getting 10 results back unless your StandardRequestHandler config in solrconfig has set rows to 0 or if none of your fields are stored. On Thu, May 23, 2013 at 7:40 PM, Jean-Sebastien Vachon jean-sebastien.vac...@wantedanalytics.com wrote: Hi All, I just started indexing data in my brand new Solr Cloud running on 4.2.1. Since I am a big user of the grouping feature, I need to route my documents on the proper shard. Following the instruction found here: http://docs.lucidworks.com/display/solr/Shards+and+Indexing+Data+in+SolrCloud I set my document id to something like this 'fieldA!id' where fieldA is the key I want to use to distribute my documents. (All documents with the same value for fieldA will be sent to the same shard). When I query my index, I can see that the number of documents increase but there are no fields at all in the index. http://10.0.5.211:8201/solr/Current/select?q=*:* response lst name=responseHeader int name=status0/int int name=QTime11/int lst name=params str name=q*:*/str /lst /lst result name=response numFound=26318 start=0 maxScore=1.0/ /response Specifying fields in the 'fl' parameter does nothing. What am I doing wrong? -- Regards, Shalin Shekhar Mangar.
Core admin action CREATE fails for existing core
It seems to me that the behavior of the Core admin action CREATE has changed when going from Solr 4.1 to 4.3. With 4.1, I could re-configure an existing core (changing path/name to solrconfig.xml for example). In 4.3, I get an error message: SEVERE: org.apache.solr.common.SolrException: Error CREATEing SolrCore 'core-tex69b6iom1djrbzmlmg83-index2': Core with name 'core-tex69b6iom1djrbzmlmg83-index2' already exists. Is this change intended? André
Re: Shardsplitting
Hi Arkadi, It does not matter where you invoke that command because ultimately that command is executed by the Overseer node. That being said, shard splitting has some bugs whose fixes will be released with Solr 4.3.1 so I'd suggest that you wait until then to use this feature. On Thu, May 23, 2013 at 6:09 PM, Arkadi Colson ark...@smartbit.be wrote: Hi When having a collection with 3 shards and 2 replicas for each shard and I want to split shard1: does it matter where to start the splitshard command in the cloud or should it be started on the master of that shard? BR, Arkadi -- Regards, Shalin Shekhar Mangar.
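For reference, the split itself is issued through the Collections API, against any node in the cluster (host, port, and collection name below are placeholders):

```
http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=collection1&shard=shard1
```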
Re: OPENNLP current patch compiling problem for 4.x branch
Hi Patrick, I think you should check out and apply the patch to branch_4x, rather than the lucene_solr_4_3_0 tag: http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x Steve On May 23, 2013, at 2:08 AM, Patrick Mi patrick...@touchpointgroup.com wrote: Hi, I checked out from here http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_3_0 and downloaded the latest patch LUCENE-2899-current.patch. Applied the patch ok but when I did 'ant compile' I got the following error: == [javac] /home/lucene_solr_4_3_0/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/FilterPayloadsFilter.java:43: error: cannot find symbol [javac] super(Version.LUCENE_44, input); [javac] ^ [javac] symbol: variable LUCENE_44 [javac] location: class Version [javac] 1 error == Compiled it on trunk without problem. Is this patch supposed to work for 4.X? Regards, Patrick
RE: Problem with document routing with Solr 4.2.1
If that can help.. adding distrib=false or shard.keys= is giving back results. -Original Message- From: Jean-Sebastien Vachon [mailto:jean-sebastien.vac...@wantedanalytics.com] Sent: May-23-13 10:39 AM To: solr-user@lucene.apache.org Subject: RE: Problem with document routing with Solr 4.2.1 I know. If a stop routing the documents and simply use a standard 'id' field then I am getting back my fields. I forgot to tell you how the collection was created. http://localhost:8201/solr/admin/collections?action=CREATEname=CurrentnumShards=15replicationFactor=3maxShardsPerNode=9 Since I am using the numshards parameter then composite routing should be working... unless I misunderstood something -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: May-23-13 10:27 AM To: solr-user@lucene.apache.org Subject: Re: Problem with document routing with Solr 4.2.1 That's strange. The default value of rows param is 10 so you should be getting 10 results back unless your StandardRequestHandler config in solrconfig has set rows to 0 or if none of your fields are stored. On Thu, May 23, 2013 at 7:40 PM, Jean-Sebastien Vachon jean-sebastien.vac...@wantedanalytics.com wrote: Hi All, I just started indexing data in my brand new Solr Cloud running on 4.2.1. Since I am a big user of the grouping feature, I need to route my documents on the proper shard. Following the instruction found here: http://docs.lucidworks.com/display/solr/Shards+and+Indexing+Data+in+So lrCloud I set my document id to something like this 'fieldA!id' where fieldA is the key I want to use to distribute my documents. (All documents with the same value for fieldA will be sent to the same shard). When I query my index, I can see that the number of documents increase but there are no fields at all in the index. 
http://10.0.5.211:8201/solr/Current/select?q=*:* response lst name=responseHeader int name=status0/int int name=QTime11/int lst name=params str name=q*:*/str /lst /lst result name=response numFound=26318 start=0 maxScore=1.0/ /response Specifying fields in the 'fl' parameter does nothing. What am I doing wrong? -- Regards, Shalin Shekhar Mangar.
Re: Core admin action CREATE fails for existing core
Yes, this did change - it's actually a protection for a previous change though. There was a time when you did a core reload by just making a new core with the same name and closing the old core - that is no longer really supported though - the proper way to do this is to use SolrCore#reload, and that has been the case for all of 4.x release if I remember right. I supported making this change to force people who might still be doing what is likely quite a buggy operation to switch to the correct code. Sorry about the inconvenience. - Mark On May 23, 2013, at 10:45 AM, André Widhani andre.widh...@digicol.de wrote: It seems to me that the behavior of the Core admin action CREATE has changed when going from Solr 4.1 to 4.3. With 4.1, I could re-configure an existing core (changing path/name to solrconfig.xml for example). In 4.3, I get an error message: SEVERE: org.apache.solr.common.SolrException: Error CREATEing SolrCore 'core-tex69b6iom1djrbzmlmg83-index2': Core with name 'core-tex69b6iom1djrbzmlmg83-index2' already exists. Is this change intended? André
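For completeness, a reload via the CoreAdmin API looks like this (host and port are placeholders; the core name is taken from the error above):

```
http://localhost:8983/solr/admin/cores?action=RELOAD&core=core-tex69b6iom1djrbzmlmg83-index2
```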
Re: fq facet on double and non-indexed field
On May 23, 2013, at 14:25, gpssolr2020 psgoms...@gmail.com wrote: Thanks Erick.. i hope we can't do q either on a non-indexed field. What is the difference between q and fq other than cache? Thanks. How do you expect to search on a field that is non-indexed (and thus non-searchable)?
RE: .skip.autorecovery=Y + restart solr after crash + losing many documents
Hi Otis, Thank you for your reply. I'm in the middle of that upgrade and will report back when testing is complete. I'd like to get some nice set of reproducible steps so I'm not just ranting on. :) Regards, Gilles -Original Message- From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] Sent: 20 May 2013 04:29 To: solr-user@lucene.apache.org Subject: Re: .skip.autorecovery=Y + restart solr after crash + losing many documents Hi Gilles, Could you upgrade to 4.3.0 and see if you can reproduce? Otis -- Solr ElasticSearch Support http://sematext.com/ On Mon, May 13, 2013 at 5:26 PM, Gilles Comeau gilles.com...@polecat.co wrote: Hi all, We write to two same-named cores in the same collection for redundancy, and are not taking advantage of the full benefits of solr cloud replication. We use solrcloud.skip.autorecovery=true so that Solr doesn't try to sync the indexes when it starts up. However, we find that if the core is not optimized prior to shutting it down (in a crash situation), we can lose all of the data after starting up. The files are written to disk, but we can lose a full 24 hours worth of data as they are all removed when we start SOLR. (I don't think it is a commit issue) If we optimize before shutting down, we never lose any data. Sadly, sometimes SOLR is in a state where optimizing is not an option. Can anyone think of why that might be? Is there any special configuration you need if you want to write directly to two cores rather than use replication? Version 4.0, this used to work in our 4.0 nightly build, but broke when we migrated to 4.0 production.(until we test and migrate to the replication setup - it won't be too long and I'm a bit embarrassed to be asking this question!) Regards, Gilles
Re: Core admin action CREATE fails for existing core
I think the wiki needs to be updated to reflect this? http://wiki.apache.org/solr/CoreAdmin If somebody adds me as an editor (AlanWoodward), I'll do it. Alan Woodward www.flax.co.uk On 23 May 2013, at 16:43, Mark Miller wrote: Yes, this did change - it's actually a protection for a previous change though. There was a time when you did a core reload by just making a new core with the same name and closing the old core - that is no longer really supported though - the proper way to do this is to use SolrCore#reload, and that has been the case for all of 4.x release if I remember right. I supported making this change to force people who might still be doing what is likely quite a buggy operation to switch to the correct code. Sorry about the inconvenience. - Mark On May 23, 2013, at 10:45 AM, André Widhani andre.widh...@digicol.de wrote: It seems to me that the behavior of the Core admin action CREATE has changed when going from Solr 4.1 to 4.3. With 4.1, I could re-configure an existing core (changing path/name to solrconfig.xml for example). In 4.3, I get an error message: SEVERE: org.apache.solr.common.SolrException: Error CREATEing SolrCore 'core-tex69b6iom1djrbzmlmg83-index2': Core with name 'core-tex69b6iom1djrbzmlmg83-index2' already exists. Is this change intended? André
Re: Core admin action CREATE fails for existing core
Alan, I've added AlanWoodward to the Solr AdminGroup page. On May 23, 2013, at 12:29 PM, Alan Woodward a...@flax.co.uk wrote: I think the wiki needs to be updated to reflect this? http://wiki.apache.org/solr/CoreAdmin If somebody adds me as an editor (AlanWoodward), I'll do it. Alan Woodward www.flax.co.uk
Re: Restaurant availability from database
Thank you for your answer. Do you mean I should index the availability data as a document in Solr? Because the availability data in our databases is around 6,509,972 records and contains the availability per number of seats and per 15 minutes. I also tried this method, and as far as I know it's only possible to join the availability documents and not to include that information per result document. An example API response (created from the Solr response):

{
  "restaurants": [
    { "id": 13906, "name": "Allerlei", "zipcode": "6511DP", "house_number": 59, "available": true },
    { "id": 13907, "name": "Voorbeeld", "zipcode": "6512DP", "house_number": 39, "available": false }
  ],
  "resultCount": 12156,
  "resultCountAvailable": 55
}

I'm currently hacking around the problem by executing the search again with a very high value for the rows parameter and counting the number of available restaurants on the backend, but this causes a big performance impact (as expected).
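The workaround described above can be sketched in a few lines. This is an illustration only, assuming the response has already been parsed into the dict shape of the example above (field names taken from that example):

```python
# Count available restaurants on the backend after re-querying with a large
# `rows` value -- the client-side tally described in the message above.
# The response shape mirrors the example API response.
response = {
    "restaurants": [
        {"id": 13906, "name": "Allerlei", "available": True},
        {"id": 13907, "name": "Voorbeeld", "available": False},
    ],
    "resultCount": 12156,
}
response["resultCountAvailable"] = sum(
    1 for r in response["restaurants"] if r["available"]
)
print(response["resultCountAvailable"])
```

The performance cost comes from fetching every matching row just to compute one count, which is why the poster calls it a hack.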
Re: Core admin action CREATE fails for existing core
Thanks! Alan Woodward www.flax.co.uk On 23 May 2013, at 17:38, Steve Rowe wrote: Alan, I've added AlanWoodward to the Solr AdminGroup page.
Re: Solr 4.3 fails to load MySQL driver
Hi, thanks for pointing this out to me.

1152 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.SolrConfig – Adding specified lib dirs to ClassLoader
org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/christian/zfmk/solr/solr-4.3.0/example/lib/mysql-connector-java-5.1.25-bin.jar' to classloader

The mysql-connector-java jar DOES get loaded, but is not available to org.apache.solr.core.SolrResourceLoader.findClass. Has something changed in the syntax for creating a dataimport handler?

solrconfig.xml:
---
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

data-config.xml:
<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/koehler_zfmk" user="my_user" password="secret"/>
  <document name="content">
    <entity name="rawidentificationid" query="SELECT * FROM foobar;">
    </entity>
  </document>
</dataConfig>

I use this configuration successfully with 3.6. Regards, Chris

On 23.05.2013 14:39, Jack Krupansky wrote: Check the Solr log on startup - it will explicitly state which lib directories/files will be used. Make sure they agree with where the DIH jars reside. Keep in mind that the directory structure of Solr changed - use the lib from the 4.3 solrconfig. Try to use DIH in the standard Solr 4.3 example first. Then mimic that in your customization.
-- Jack Krupansky -Original Message- From: Christian Köhler Sent: Thursday, May 23, 2013 8:25 AM To: solr-user@lucene.apache.org Subject: Solr 4.3 fails to load MySQL driver Hi, in my attempt to migrate from 3.6.x to 4.3.0 I stumbled upon an issue loading the MySQL driver from the [instance]/lib dir:

Caused by: java.lang.ClassNotFoundException: org.apache.solr.handler.dataimport.DataImportHandler
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:266)
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:448)
... 18 more

To narrow it down, I use the plain example configuration with the following changes:
- Added a dataimport requestHandler to example/conf/solrconfig.xml (copied from a working Solr 3.6.x)
- Created example/conf/data-config.xml with <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" ... and SQL statement (both copied from a working Solr 3.6.x)
- Placed the current driver mysql-connector-java-5.1.25-bin.jar in example/lib

To my knowledge the lib dir is included automatically in the path. To make sure, I tried to:
- add <lib dir="./lib" /> explicitly to solrconfig.xml
- add the absolute path to solrconfig.xml
- change solr.xml to use <solr persistent="true" sharedLib="lib">

All to no avail. System Info:
- OpenJDK Runtime Environment 1.7.0_19
- Solr 4.3.0
- mysql-connector-java-5.1.25-bin.jar

The same configuration ran fine with Solr 3.6.x on the very same machine. Any help is appreciated!
Cheers Chris -- Christian Köhler ganzgraph gmbh Bornheimer Straße 37 53111 Bonn koeh...@ganzgraph.de http://www.ganzgraph.de/ Tel.: +49-(0)228-227 99 400 Fax : +49-(0)228-227 99 409 Geschäftsführer: Christian Köhler, Thorsten Orth Unternehmenssitz: Bonn Handelsregister-Nummer: HRB 19066 beim Amtsgericht: Bonn UstId-Nr: DE 280482111
Re: Solr 4.3 fails to load MySQL driver
: in my attempt to migrate from 3.6.x to 4.3.0 I stumbled upon an issue loading
: the MySQL driver from the [instance]/lib dir:
:
: Caused by: java.lang.ClassNotFoundException:
: org.apache.solr.handler.dataimport.DataImportHandler

One of us is mistaken about what that error means. You say it means that the MySQL driver isn't being loaded, but nothing in your mail suggests to me that there is a problem loading the MySQL driver. What I see is that Solr can't seem to load the DIH class, suggesting that the dataimporthandler jar is not getting loaded. There may or may not also be a problem loading the MySQL driver, but nothing is even going to attempt to do so unless Solr can successfully construct an instance of the DataImportHandler. So unless there are more details in your error that start mentioning the MySQL classes, I would check your lib settings for loading the DIH jars and make sure those are right. -Hoss
Re: Fast faceting over large number of distinct terms
Interesting solution. My concern is how to select the most frequent terms in the story_text field in a way that would make sense to the user. Only including the X most common non-stopword terms in a document could easily cause important patterns to be missed. There's a similar issue with only returning counts for terms in the top N documents matching a particular query. Also is there an efficient way to add term counts on the client side? I thought of using the TermVectorComponent to get document level frequency counts and then using something like Hadoop to add them up. However, I couldn't find any documentation on using the results of a solr query to feed a map reduce operation. -- David On Wed, May 22, 2013 at 11:12 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Here's a possibility: At index time extract important terms (and/or phrases) from this story_text and store top N of them in a separate field (which will be much smaller/shorter). Then facet on that. Or just retrieve it and manually parse and count in the client if that turns out to be faster. I did this in the previous decade before Solr was available and it worked well. I limited my counting to top N (200?) hits. Otis -- Solr ElasticSearch Support http://sematext.com/ On Wed, May 22, 2013 at 10:54 PM, David Larochelle dlaroche...@cyber.law.harvard.edu wrote: The goal of the system is to obtain data that can be used to generate word clouds so that users can quickly get a sense of the aggregate contents of all documents matching a particular query. For example, a user might want to see a word cloud of all documents discussing 'Iraq' in a particular new papers. Faceting on story_text gives counts of individual words rather than entire text strings. I think this is because of the tokenization that happens automatically as part of the text_general type. 
I'm happy to look at alternatives to faceting, but I wasn't able to find one that provided aggregate word counts for just the documents matching a particular query, rather than for individual documents or the entire index. -- David On Wed, May 22, 2013 at 10:32 PM, Brendan Grainger brendan.grain...@gmail.com wrote: Hi David, Out of interest, what are you trying to accomplish by faceting over the story_text field? Is it generally the case that the story_text field will contain values that are repeated or categorize your documents somehow? From your description: story_text is used to store free-form text obtained by crawling newspapers and blogs, it doesn't seem that way, so I'm not sure faceting is what you want in this situation. Cheers, Brendan On Wed, May 22, 2013 at 9:49 PM, David Larochelle dlaroche...@cyber.law.harvard.edu wrote: I'm trying to quickly obtain cumulative word frequency counts over all documents matching a particular query. I'm running Solr 4.3.0 on a machine with 16GB of RAM. My index is 2.5 GB and has around 350,000 documents. My schema includes the following fields:

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="media_id" type="int" indexed="true" stored="true" required="true" multiValued="false" />
<field name="story_text" type="text_general" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true" />

story_text is used to store free-form text obtained by crawling newspapers and blogs. Running faceted searches with the fc or fcs methods fails with the error "Too many values for UnInvertedField faceting on field story_text": http://localhost:8983/solr/query?q=id:106714828_6621&facet=true&facet.limit=10&facet.pivot=publish_date,story_text&rows=0&facet.method=fcs Running a faceted search with the 'enum' method succeeds but takes a very long time.
http://localhost:8983/solr/query?q=includes:foobar&facet=true&facet.limit=100&facet.pivot=media_id,includes&facet.method=enum&rows=0 http://localhost:8983/solr/query?q=includes:mccain&facet=true&facet.limit=100&facet.pivot=media_id,includes&facet.method=enum&rows=0 The frustrating thing is that even if the query only returns a few hundred documents, it still takes 10 minutes or longer to get the cumulative word count results. Eventually we're hoping to build a system that will return results in a few seconds and scale to hundreds of millions of documents. Is there any way to get this level of performance out of Solr/Lucene? Thanks, David -- Brendan Grainger www.kuripai.com
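On the client-side aggregation question raised earlier in the thread: once per-document term frequencies have been fetched (e.g. parsed out of TermVectorComponent responses), summing them is cheap and needs no Hadoop job for a few hundred hits. A sketch with made-up data, assuming the per-document frequency maps are already in hand:

```python
from collections import Counter

# Per-document term-frequency maps; in practice these would be parsed from
# TermVectorComponent output for the documents matching the query.
# The terms and counts below are invented for illustration.
per_doc_tf = [
    {"iraq": 4, "report": 1},
    {"iraq": 2, "election": 3},
    {"election": 1, "report": 2},
]

# Sum into corpus-wide counts for the matching set -- word-cloud input.
total = Counter()
for tf in per_doc_tf:
    total.update(tf)

top_terms = total.most_common(2)
print(top_terms)
```

This addresses the counting step only; selecting which terms to keep (the stopword and top-N concerns above) is a separate question.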
Re: Upgrading from SOLR 3.5 to 4.2.1 Results.
Actually, it's pretty high end for most users. Rishi, you can post the real h/w details and our typical deployment: no. of CPUs per node, no. of disks per host, VMs per host, GC params, no. of cores per instance. Noble Paul Sent from phone On 21 May 2013 01:47, Rishi Easwaran rishi.easwa...@aol.com wrote: No, we just upgraded to 4.2.1. With the size of our complex and the effort required to apply our patches and roll out, our upgrades are not that often. -Original Message- From: Noureddine Bouhlel nouredd...@ecotour.com To: solr-user solr-user@lucene.apache.org Sent: Mon, May 20, 2013 3:36 pm Subject: Re: Upgrading from SOLR 3.5 to 4.2.1 Results. Hi Rishi, Have you done any tests with Solr 4.3? Regards, Cordialement, BOUHLEL Noureddine On 17 May 2013 21:29, Rishi Easwaran rishi.easwa...@aol.com wrote: Hi All, It's Friday 3:00pm, warm and sunny outside, and it was a good week. Figured I'd share some good news. I work for the AOL mail team and we use SOLR for our mail search backend. We have been using it since pre-SOLR 1.4 and are strong supporters of the SOLR community. We deal with millions of indexes and billions of requests a day across our complex. We finished the full rollout of SOLR 4.2.1 into our production last week. Some key highlights: - ~75% reduction in search response times - ~50% reduction in SOLR disk busy, which in turn helped with a ~90% reduction in errors - Garbage collection total stop reduction by over 50%, moving application throughput into the 99.8% - 99.9% range - ~15% reduction in CPU usage We did not tune our application moving from 3.5 to 4.2.1, nor update Java. For the most part it was a binary upgrade, with patches for our special use case. Going forward we are looking at prototyping SOLR Cloud for our search system, upgrading Java and Tomcat, and tuning our application further. Lots of fun stuff :) Have a great weekend everyone. Thanks, Rishi.
Re: Solr 4.3 fails to load MySQL driver
Hi, one of us is mistaken about what that error means. You say it means that the MySQL driver isn't being loaded, but nothing in your mail suggests to me that there is a problem loading the MySQL driver. What I see is that Solr can't seem to load the DIH class, suggesting that the dataimporthandler jar is not getting loaded. I corrected myself in my last mail: the MySQL driver IS loaded (thanks for pointing out to me where to look). There may or may not also be a problem loading the MySQL driver, but I only SUSPECT the MySQL driver of being the culprit for the dataimporthandler jar not getting loaded. Not sure! ... the MySQL classes, I would check your lib settings for loading the DIH jars ... I am not using DIH. IMHO it's just the plain example code in solr-4.3.0/example/solr/collection1/ that is being called. I include the full trace to clarify my problem (hopefully). Cheers, Chris

/home/solr-4.3.0/example# java -jar start.jar
0 [main] INFO org.eclipse.jetty.server.Server – jetty-8.1.8.v20121106
19 [main] INFO org.eclipse.jetty.deploy.providers.ScanningAppProvider – Deployment monitor /home/solr/solr-4.3.0/example/contexts at interval 0
24 [main] INFO org.eclipse.jetty.deploy.DeploymentManager – Deployable added: /home/solr/solr-4.3.0/example/contexts/solr-jetty-context.xml
653 [main] INFO org.eclipse.jetty.webapp.StandardDescriptorProcessor – NO JSP Support for /solr, did not find org.apache.jasper.servlet.JspServlet
Null identity service, trying login service: null
Finding identity service: null
674 [main] INFO org.eclipse.jetty.server.handler.ContextHandler – started o.e.j.w.WebAppContext{/solr,file:/home/solr/solr-4.3.0/example/solr-webapp/webapp/},/home/solr/solr-4.3.0/example/webapps/solr.war
674 [main] INFO org.eclipse.jetty.server.handler.ContextHandler – started o.e.j.w.WebAppContext{/solr,file:/home/solr/solr-4.3.0/example/solr-webapp/webapp/},/home/solr/solr-4.3.0/example/webapps/solr.war
688 [main] INFO org.apache.solr.servlet.SolrDispatchFilter –
SolrDispatchFilter.init() 703 [main] INFO org.apache.solr.core.SolrResourceLoader – JNDI not configured for solr (NoInitialContextEx) 704 [main] INFO org.apache.solr.core.SolrResourceLoader – solr home defaulted to 'solr/' (could not find system property or JNDI) 713 [main] INFO org.apache.solr.core.CoreContainer – looking for solr config file: /home/solr/solr-4.3.0/example/solr/solr.xml 715 [main] INFO org.apache.solr.core.CoreContainer – New CoreContainer 1857140958 716 [main] INFO org.apache.solr.core.CoreContainer – Loading CoreContainer using Solr Home: 'solr/' 716 [main] INFO org.apache.solr.core.SolrResourceLoader – new SolrResourceLoader for directory: 'solr/' 962 [main] INFO org.apache.solr.core.CoreContainer – loading shared library: /home/solr/solr-4.3.0/example/solr/lib 962 [main] ERROR org.apache.solr.core.SolrResourceLoader – Can't find (or read) file to add to classloader: solr/lib 971 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting socketTimeout to: 0 973 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting urlScheme to: http:// 973 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting connTimeout to: 0 974 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting maxConnectionsPerHost to: 20 974 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting corePoolSize to: 0 974 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting maximumPoolSize to: 2147483647 974 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting maxThreadIdleTime to: 5 974 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting sizeOfQueue to: -1 975 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting fairnessPolicy to: false 980 [main] INFO org.apache.solr.client.solrj.impl.HttpClientUtil – Creating new http client, 
config:maxConnectionsPerHost=20maxConnections=1socketTimeout=0connTimeout=0retry=false 1073 [main] INFO org.apache.solr.core.CoreContainer – Registering Log Listener 1087 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.CoreContainer – Creating SolrCore 'collection1' using instanceDir: solr/collection1 1088 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.SolrResourceLoader – new SolrResourceLoader for directory: 'solr/collection1/' 1143 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.SolrConfig – Adding specified lib dirs to ClassLoader 1144 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/solr/solr-4.3.0/example/lib/jetty-util-8.1.8.v20121106.jar' to classloader 1144 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/solr/solr-4.3.0/example/lib/servlet-api-3.0.jar' to
RE: Problem with document routing with Solr 4.2.1
I must add that shard.keys= does not return anything on two of my nodes. But that is to be expected since I'm using a replication factor of 3 on a cloud of 5 servers. -Original Message- From: Jean-Sebastien Vachon [mailto:jean-sebastien.vac...@wantedanalytics.com] Sent: May-23-13 11:27 AM To: solr-user@lucene.apache.org Subject: RE: Problem with document routing with Solr 4.2.1 If it helps: adding distrib=false or shard.keys= gives back results. -Original Message- From: Jean-Sebastien Vachon [mailto:jean-sebastien.vac...@wantedanalytics.com] Sent: May-23-13 10:39 AM To: solr-user@lucene.apache.org Subject: RE: Problem with document routing with Solr 4.2.1 I know. If I stop routing the documents and simply use a standard 'id' field then I am getting back my fields. I forgot to tell you how the collection was created: http://localhost:8201/solr/admin/collections?action=CREATE&name=Current&numShards=15&replicationFactor=3&maxShardsPerNode=9 Since I am using the numShards parameter, composite routing should be working... unless I misunderstood something. -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: May-23-13 10:27 AM To: solr-user@lucene.apache.org Subject: Re: Problem with document routing with Solr 4.2.1 That's strange. The default value of the rows param is 10, so you should be getting 10 results back unless your StandardRequestHandler config in solrconfig has set rows to 0, or if none of your fields are stored. On Thu, May 23, 2013 at 7:40 PM, Jean-Sebastien Vachon jean-sebastien.vac...@wantedanalytics.com wrote: Hi All, I just started indexing data in my brand new Solr Cloud running on 4.2.1. Since I am a big user of the grouping feature, I need to route my documents to the proper shard. Following the instructions found here: http://docs.lucidworks.com/display/solr/Shards+and+Indexing+Data+in+SolrCloud I set my document id to something like 'fieldA!id', where fieldA is the key I want to use to distribute my documents.
(All documents with the same value for fieldA will be sent to the same shard). When I query my index, I can see that the number of documents increases, but there are no fields at all in the index. http://10.0.5.211:8201/solr/Current/select?q=*:*

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">11</int>
    <lst name="params">
      <str name="q">*:*</str>
    </lst>
  </lst>
  <result name="response" numFound="26318" start="0" maxScore="1.0"/>
</response>

Specifying fields in the 'fl' parameter does nothing. What am I doing wrong? -- Regards, Shalin Shekhar Mangar.
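The 'fieldA!id' convention above can be illustrated with a small sketch. This is not Solr's actual router: the compositeId router hashes the part before '!' with its own hash function, and md5 is used here purely to show the property that matters for grouping, namely that documents sharing a route key always map to the same shard:

```python
import hashlib

def shard_for(doc_id: str, num_shards: int) -> int:
    """Illustrative only: hash the route key (the part before '!')."""
    route_key = doc_id.split("!", 1)[0]
    digest = hashlib.md5(route_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# All documents with the same fieldA value land on the same shard,
# so grouping on fieldA can be computed shard-locally.
a = shard_for("companyA!doc1", 15)
b = shard_for("companyA!doc2", 15)
print(a == b)  # prints True
```

Note this only demonstrates the routing idea; it says nothing about the missing-fields symptom in the thread, which turned out to be query-side.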
Re: Solr 4.3 fails to load MySQL driver
: I only SUSPECT the MySQL driver of being the culprit for the dataimporthandler
: jar not getting loaded. Not sure!

The dataimporthandler *class* is not getting loaded because the dataimporthandler *jar* is not getting loaded.

: MySQL classes, I would check your lib settings for loading the DIH
: jars
:
: I am not using DIH. IMHO it's just the plain example code in
: solr-4.3.0/example/solr/collection1/ that is being called.

I'm totally confused ... DIH == DataImportHandler ... it's just an acronym. You say you aren't using DIH, but you are having a problem loading DIH, so DIH is used in your configs.

: I include the full trace to clarify my problem (hopefully)
...
: org.apache.solr.core.SolrResourceLoader – new SolrResourceLoader for
: directory: 'solr/collection1/'
: 1143 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.SolrConfig –
: Adding specified lib dirs to ClassLoader
: 1144 [coreLoadExecutor-3-thread-1] INFO
: org.apache.solr.core.SolrResourceLoader – Adding
: 'file:/home/solr/solr-4.3.0/example/lib/jetty-util-8.1.8.v20121106.jar' to
: classloader

...ok, for starters this makes no sense, and may be the cause of some problems. You apparently have your collection1 configs set up to load all of the classes from the /home/solr/solr-4.3.0/example/lib directory as part of the collection1 classloader. You really don't want to do that. It will most likely cause you all sorts of problems, even if it's unrelated to the current problem. Second, note in particular all of the lines that look like the line above -- specifically lines that say org.apache.solr.core.SolrResourceLoader - Adding ... to classloader. Besides the ones referring to /home/solr/solr-4.3.0/example/lib/ (which is almost certainly not what you want) you then have a bunch referring to contrib/extraction, contrib/langid, and contrib/velocity -- all of which is great; those plugins and their dependencies are now available to use.
But nowhere does it ever say anything about adding the contrib/dataimporthandler jars to the classloader, which means your config isn't set up to load any of the dataimporthandler jars as plugins. Which means when it's done loading plugins, and it starts to initialize things like RequestHandlers, and it finds a reference to the DataImportHandler, it doesn't know what that means...

: Caused by: java.lang.ClassNotFoundException:
: org.apache.solr.handler.dataimport.DataImportHandler

If you look at the 4.3 DIH examples, you'll note that the only solrconfig.xml files that mention DataImportHandler also include lib directives like the following, in order to load dataimporthandler as a plugin...

<lib dir="../../../../dist/" regex="solr-dataimporthandler-.*\.jar" />
...
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler"

-Hoss
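As a quick sanity check on the directive above: the regex attribute is matched against file names in the given directory, so the pattern from the DIH example picks up the dataimporthandler jars and nothing else. A small sketch (jar names assumed from a stock 4.3.0 dist/ directory):

```python
import re

# Pattern taken from the lib directive in the 4.3 DIH example solrconfig.xml.
pattern = re.compile(r"solr-dataimporthandler-.*\.jar")

# Hypothetical directory listing.
files = [
    "solr-dataimporthandler-4.3.0.jar",
    "solr-dataimporthandler-extras-4.3.0.jar",
    "mysql-connector-java-5.1.25-bin.jar",
]
matched = [f for f in files if pattern.fullmatch(f)]
print(matched)
```

The MySQL driver jar deliberately falls outside the pattern; it needs its own lib directive (or a shared lib dir) to be visible.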
AW: Core admin action CREATE fails for existing core
Mark, Alan, thanks for explaining and updating the wiki. When reloading the core using action=CREATE with Solr 4.1 I could specify the path to schema and config. In fact I used this to reconfigure the core to use a specific one of two prepared config files, depending on some external index state (instead of making changes to one and the same config file). action=RELOAD does not understand the corresponding request parameters schema and config (which is why I used CREATE, not RELOAD, in the first place). So the functionality to switch to a different config file for an existing core is no longer there, I guess? Thanks, André From: Alan Woodward [a...@flax.co.uk] Sent: Thursday, 23 May 2013 18:43 To: solr-user@lucene.apache.org Subject: Re: Core admin action CREATE fails for existing core Thanks! Alan Woodward www.flax.co.uk
Re: Core admin action CREATE fails for existing core
You're right - that does seem to be a new limitation. Could you create a JIRA issue for it? It would be fairly simple to add another reload method that also took the name of a new solrconfig/schema file. - Mark On May 23, 2013, at 4:11 PM, André Widhani andre.widh...@digicol.de wrote: Mark, Alan, thanks for explaining and updating the wiki. When reloading the core using action=CREATE with Solr 4.1 I could specify the path to schema and config. In fact I used this to reconfigure the core to use a specific one of two prepared config files depending on some external index state (instead of making changes to one and the same config file). action=RELOAD does not understand the corresponding request parameters schema and config (which is why I used CREATE, not RELOAD in the first place). So the functionality to switch to a different config file for an existing core is no longer there, I guess? Thanks, André
Re: Solr 4.3 fails to load MySQL driver
Hi, I'm totally confused ... DIH == DataImportHandler ... it's just an acronym. You say you aren't using DIH, but you are having a problem loading DIH, so DIH is used in your configs. Sorry for the confusion. I was just trying to say: I use the example code from solr-4.3.0/example/solr and not from solr-4.3.0/example/example-DIH ... OK, for starters this makes no sense, and may be the cause of some problems. You apparently have your collection1 configs set up to load all of the classes from the /home/solr/solr-4.3.0/example/example/lib directory as part of the collection1 classloader. You really don't want to do that. It will most likely cause you all sorts of problems, even if it's unrelated to the current problem. For Solr it was recommended to place the MySQL driver in solr_3.6.2/example/lib/. This dir is loaded by default in 3.6 (as I did not add any additional lib dirs). That's why I did this in 4.3 as well. What's the best practice for placing third-party libs? I added example/lib/ to collection1/conf/solrconfig.xml as a lib dir. Without this, the MySQL driver is not loaded according to the org.apache.solr.core.SolrResourceLoader – Adding xxx messages. But nowhere does it ever say anything about adding contrib/dataimporthandler jars to the classloader. collection1/conf/solrconfig.xml has the following lib dirs by default:

<lib dir="../../../contrib/extraction/lib" regex=".*\.jar" />
<lib dir="../../../dist/" regex="solr-cell-\d.*\.jar" />
<lib dir="../../../contrib/clustering/lib/" regex=".*\.jar" />
<lib dir="../../../dist/" regex="solr-clustering-\d.*\.jar" />
<lib dir="../../../contrib/langid/lib/" regex=".*\.jar" />
<lib dir="../../../dist/" regex="solr-langid-\d.*\.jar" />
<lib dir="../../../contrib/velocity/lib" regex=".*\.jar" />
<lib dir="../../../dist/" regex="solr-velocity-\d.*\.jar" />

Looks the same to me as in 3.6. ...which means your config isn't set up to load any of the dataimporthandler jars as plugins. That means I have to configure the dataimporthandler manually in 4.3?
If yes, this is the root of all problems ... which means when it's done loading plugins, and it starts to initialize things like RequestHandlers, and it finds a reference to the DataImportHandler, it doesn't know what that means... : Caused by: java.lang.ClassNotFoundException: : org.apache.solr.handler.dataimport.DataImportHandler If you look at the 4.3 DIH examples, you'll note that the only solrconfig.xml files that mention DataImportHandler also include lib directives like the following in order to load dataimporthandler as a plugin...

<lib dir="../../../../dist/" regex="solr-dataimporthandler-.*\.jar" />

Included this ... to no avail.

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">

still does not load. Regards Chris
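Putting the pieces of this thread together, a minimal sketch of what the wiring in collection1/conf/solrconfig.xml might look like — assuming the stock 4.3 example directory layout; the db-data-config.xml file name is a placeholder taken from the DIH examples, and relative paths must be adjusted to your install:

```xml
<!-- collection1/conf/solrconfig.xml (sketch only) -->

<!-- load the DataImportHandler plugin jars from the dist directory -->
<lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />

<!-- register the handler; "db-data-config.xml" names your DIH config -->
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</requestHandler>
```

The MySQL driver jar itself can typically go in the core's instanceDir/lib directory (which SolrResourceLoader picks up automatically) rather than in a shared example/lib referenced by every core.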
Core admin action CREATE fails to persist some settings in solr.xml with Solr 4.3
When I create a core with the Core admin handler using these request parameters: action=CREATE name=core-tex69bbum21ctk1kq6lmkir-index3 schema=/etc/opt/dcx/solr/conf/schema.xml instanceDir=/etc/opt/dcx/solr/ config=/etc/opt/dcx/solr/conf/solrconfig.xml dataDir=/var/opt/dcx/solr/core-tex69bbum21ctk1kq6lmkir-index3 in Solr 4.1, solr.xml would have the following entry:

<core schema="/etc/opt/dcx/solr/conf/schema.xml" loadOnStartup="true" instanceDir="/etc/opt/dcx/solr/" transient="false" name="core-tex69bbum21ctk1kq6lmkir-index3" config="/etc/opt/dcx/solr/conf/solrconfig.xml" dataDir="/var/opt/dcx/solr/core-tex69bbum21ctk1kq6lmkir-index3" collection="core-tex69bbum21ctk1kq6lmkir-index3"/>

while in Solr 4.3 schema, config and dataDir will be missing:

<core loadOnStartup="true" instanceDir="/etc/opt/dcx/solr/" transient="false" name="core-tex69bbum21ctk1kq6lmkir-index3" collection="core-tex69bbum21ctk1kq6lmkir-index3"/>

The new core would use the settings specified during CREATE, but after a Solr restart they are lost (fall back to some defaults), as they are not persisted in solr.xml. Is this a bug or am I doing something wrong here? André
AW: Core admin action CREATE fails for existing core
Ok - yes, will do so tomorrow. Thanks, André From: Mark Miller [markrmil...@gmail.com] Sent: Thursday, 23 May 2013 22:46 To: solr-user@lucene.apache.org Subject: Re: Core admin action CREATE fails for existing core You're right - that does seem to be a new limitation. Could you create a JIRA issue for it? It would be fairly simple to add another reload method that also took the name of a new solrconfig/schema file. - Mark On May 23, 2013, at 4:11 PM, André Widhani andre.widh...@digicol.de wrote: Mark, Alan, thanks for explaining and updating the wiki. When reloading the core using action=CREATE with Solr 4.1 I could specify the path to schema and config. In fact I used this to reconfigure the core to use a specific one of two prepared config files depending on some external index state (instead of making changes to one and the same config file). action=RELOAD does not understand the corresponding request parameters schema and config (which is why I used CREATE, not RELOAD, in the first place). So the functionality to switch to a different config file for an existing core is no longer there, I guess? Thanks, André From: Alan Woodward [a...@flax.co.uk] Sent: Thursday, 23 May 2013 18:43 To: solr-user@lucene.apache.org Subject: Re: Core admin action CREATE fails for existing core Thanks! Alan Woodward www.flax.co.uk On 23 May 2013, at 17:38, Steve Rowe wrote: Alan, I've added AlanWoodward to the Solr AdminGroup page. On May 23, 2013, at 12:29 PM, Alan Woodward a...@flax.co.uk wrote: I think the wiki needs to be updated to reflect this? http://wiki.apache.org/solr/CoreAdmin If somebody adds me as an editor (AlanWoodward), I'll do it. Alan Woodward www.flax.co.uk On 23 May 2013, at 16:43, Mark Miller wrote: Yes, this did change - it's actually a protection for a previous change though.
There was a time when you did a core reload by just making a new core with the same name and closing the old core - that is no longer really supported though - the proper way to do this is to use SolrCore#reload, and that has been the case for all of the 4.x releases if I remember right. I supported making this change to force people who might still be doing what is likely quite a buggy operation to switch to the correct code. Sorry about the inconvenience. - Mark On May 23, 2013, at 10:45 AM, André Widhani andre.widh...@digicol.de wrote: It seems to me that the behavior of the Core admin action CREATE has changed when going from Solr 4.1 to 4.3. With 4.1, I could re-configure an existing core (changing path/name to solrconfig.xml for example). In 4.3, I get an error message: SEVERE: org.apache.solr.common.SolrException: Error CREATEing SolrCore 'core-tex69b6iom1djrbzmlmg83-index2': Core with name 'core-tex69b6iom1djrbzmlmg83-index2' already exists. Is this change intended? André
Warning: no uniqueKey specified in schema.
Hi, I just downloaded Apache Solr 4.3.0 from http://lucene.apache.org/solr/. I then got into the /example directory and started Solr with: java -Djava.util.logging.config.file=etc/logging.properties -Dsolr.solr.home=./example-DIH/solr/ -jar start.jar I have not made any changes at this point and I get the following Warning: no uniqueKey specified in schema. I have no clue why this warning occurs, because the schema.xml has <uniqueKey>id</uniqueKey>. Isn’t this correctly defined? I have not changed the examples in any way, just ran them. I would like to add that if I use the normal Solr (not the one with the DataImportHandler): java -Djava.util.logging.config.file=etc/logging.properties -jar start.jar this warning does not occur. I’d appreciate any clues on why this warning occurs in the example-DIH. Thank you, O. O. -- View this message in context: http://lucene.472066.n3.nabble.com/Warning-no-uniqueKey-specified-in-schema-tp4065791.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Warning: no uniqueKey specified in schema.
On 5/23/2013 3:50 PM, O. Olson wrote: I just downloaded Apache Solr 4.3.0 from http://lucene.apache.org/solr/. I then got into the /example directory and started Solr with: java -Djava.util.logging.config.file=etc/logging.properties -Dsolr.solr.home=./example-DIH/solr/ -jar start.jar I have not made any changes at this point and I get the following Warning: no uniqueKey specified in schema. One of the cores defined in example-DIH, specifically the one named tika, does not have uniqueKey in its schema. example/example-DIH/solr/tika/conf/schema.xml Thanks, Shawn
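If you want the warning gone for that core, declaring a uniqueKey in the tika schema is enough. A minimal sketch — the field name "id" and type "string" are conventional choices here, not something the tika example already defines, and the string fieldType must exist in that schema:

```xml
<!-- example/example-DIH/solr/tika/conf/schema.xml (sketch only);
     every indexed document must then supply a non-empty "id" value -->
<field name="id" type="string" indexed="true" stored="true" required="true" />
<uniqueKey>id</uniqueKey>
```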
Re: Restaurant availability from database
Hossman did a presentation on something similar to this using spatial data at a Solr meetup some months ago. http://people.apache.org/~hossman/spatial-for-non-spatial-meetup-20130117/ May be helpful to you. On Thu, May 23, 2013 at 9:40 AM, rajh ron...@trimm.nl wrote: Thank you for your answer. Do you mean I should index the availability data as a document in Solr? Because the availability data in our databases is around 6,509,972 records and contains the availability per number of seats and per 15 minutes. I also tried this method, and as far as I know it's only possible to join the availability documents and not to include that information per result document. An example API response (created from the Solr response): { restaurants: [ { id: 13906, name: Allerlei, zipcode: 6511DP, house_number: 59, available: true }, { id: 13907, name: Voorbeeld, zipcode: 6512DP, house_number: 39, available: false } ], resultCount: 12156, resultCountAvailable: 55, } I'm currently hacking around the problem by executing the search again with a very high value for the rows parameter and counting the number of available restaurants on the backend, but this causes a big performance impact (as expected). -- View this message in context: http://lucene.472066.n3.nabble.com/Restaurant-availability-from-database-tp4065609p4065710.html Sent from the Solr - User mailing list archive at Nabble.com.
Note on The Book
To those of you who may have heard about the Lucene/Solr book that I and two others are writing on Lucene and Solr, some bad and good news. The bad news: The book contract with O’Reilly has been canceled. The good news: I’m going to proceed with self-publishing (possibly on Lulu or even Amazon) a somewhat reduced scope Solr-only Reference Guide (with hints of Lucene). The scope of the previous effort was too great, even for O’Reilly – a book larger than 800 pages (or even 600) that was heavy on reference and lighter on “guide” just wasn’t fitting in with their traditional “guide” model. In truth, Solr is just too complex for a simple guide that covers it all, let alone Lucene as well. I’ll announce more details in the coming weeks, but I expect to publish an e-book-only version of the book, focused on Solr reference (and plenty of guide as well), possibly on Lulu, plus eventually publish 4-8 individual print volumes for people who really want the paper. One model I may pursue is to offer the current, incomplete, raw, rough, draft as a $7.99 e-book, with the promise of updates every two weeks or a month as new and revised content and new releases of Solr become available. Maybe the individual e-book volumes would be $2 or $3. These are just preliminary ideas. Feel free to let me know what seems reasonable or excessive. For paper: Do people really want perfect bound, or would you prefer spiral bound that lies flat and folds back easily? I suppose we could offer both – which should be considered “premium”? I’ll announce more details next week. The immediate goal will be to get the “raw rough draft” available to everyone ASAP. For those of you who have been early reviewers – your effort will not have been in vain. I have all your comments and will address them over the next month or two or three. 
Just for some clarity, the existing Solr Wiki and even the recent contribution of the LucidWorks Solr Reference to Apache really are still great contributions to general knowledge about Solr, but the book is intended to go much deeper into detail, especially with loads of examples and a lot more narrative guide. For example, the book has a complete list of the analyzer filters, each with a clean one-liner description. Ditto for every parameter (although I would note that the LucidWorks Solr Reference does a decent job of that as well.) Maybe, eventually, everything in the book COULD (and will) be integrated into the standard Solr doc, but until then, a single, integrated reference really is sorely needed. And, the book has a lot of narrative guide and walking through examples as well. Over time, I’m sure both will evolve. And just to be clear, the book is not a simple repurposing of the Solr wiki content – EVERY description of everything has been written fresh, from scratch. So, for example, analyzer filters get both short one-liner summary descriptions as well as more detailed descriptions, plus formal attribute specifications and numerous examples, including sample input and outputs (the LucidWorks Solr Reference does a better job with examples as well.) The book has been written in parallel with branch_4x and that will continue. -- Jack Krupansky
Re: Can anyone explain this Solr query behavior?
Hi Erick, Here's the output after turning on the debug flag:

*q=text:()&debug=query* yields:

<response> <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">17</int> <lst name="params"> <str name="indent">true</str> <str name="q">text:()</str> <str name="debug">query</str> </lst> </lst> <result name="response" numFound="0" start="0" maxScore="0.0"/> <lst name="debug"> <str name="rawquerystring">text:()</str> <str name="querystring">text:()</str> <str name="parsedquery">(+())/no_coord</str> <str name="parsedquery_toString">+()</str> <str name="QParser">ExtendedDismaxQParser</str> <null name="altquerystring"/> <null name="boost_queries"/> <arr name="parsed_boost_queries"/> <null name="boostfuncs"/> </lst> </response>

*q=doc-id:3000&debug=query* yields:

<response> <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">17</int> <lst name="params"> <str name="q">doc-id:3000</str> <str name="debug">query</str> </lst> </lst> <result name="response" numFound="1" start="0" maxScore="11.682044"> <doc> : </doc> </result> <lst name="debug"> <str name="rawquerystring">doc-id:3000</str> <str name="querystring">doc-id:3000</str> <str name="parsedquery">(+doc-id:3000)/no_coord</str> <str name="parsedquery_toString">+doc-id:`#8;#0;#0;#23;8</str> <str name="QParser">ExtendedDismaxQParser</str> <null name="altquerystring"/> <null name="boost_queries"/> <arr name="parsed_boost_queries"/> <null name="boostfuncs"/> </lst> </response>

*q=doc-id:3000 AND text:()&debug=query* yields:

<response> <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">23</int> <lst name="params"> <str name="q">doc-id:3000 AND text:()</str> <str name="debug">query</str> </lst> </lst> <result name="response" numFound="631647" start="0" maxScore="8.056607"> <doc> : </doc> <doc> : </doc> <doc> : </doc> <doc> : </doc> <doc> : </doc> </result> <lst name="debug"> <str name="rawquerystring">doc-id:3000 AND text:()</str> <str name="querystring">doc-id:3000 AND text:()</str> <str name="parsedquery">(+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and | Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0/no_coord</str> <str name="parsedquery_toString">+(doc-id:`#8;#0;#0;#23;8 (Publisher:and^2.0 | text:and | Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0))</str> <str name="QParser">ExtendedDismaxQParser</str> <null name="altquerystring"/> <null name="boost_queries"/> <arr name="parsed_boost_queries"/> <null name="boostfuncs"/> </lst> </response>

*solrconfig.xml:*

<requestHandler name="/select" class="solr.SearchHandler"> <lst name="defaults"> <str name="echoParams">explicit</str> <int name="rows">10</int> <str name="df">text</str> <str name="defType">edismax</str> <str name="qf">text^1.0 Title^3.0 Classification^2.0 Contributors^2.0 Publisher^2.0</str> </lst>

*schema.xml:*

<field name="text" type="my_text" indexed="true" stored="false" required="false"/> <dynamicField name="*" type="my_text" indexed="true" stored="true" multiValued="false"/> <fieldType name="my_text" class="solr.TextField"> <analyzer type="index" class="MyAnalyzer"/> <analyzer type="query" class="MyAnalyzer"/> <analyzer type="multiterm" class="MyAnalyzer"/> </fieldType>

*Note:* MyAnalyzer, among a few other customizations, uses WhitespaceTokenizer and LowerCaseFilter. Thanks a lot. -Shankar On Thu, May 23, 2013 at 4:34 AM, Erick Erickson erickerick...@gmail.com wrote: Please post the results of adding debug=query to the URL. That'll tell us what the query parser spits out, which is much easier to analyze. Best Erick On Wed, May 22, 2013 at 12:16 PM, Shankar Sundararaju shan...@ebrary.com wrote: This query returns 0 documents: *q=(+Title:() +Classification:() +Contributors:() +text:())* This returns 1 document: *q=doc-id:3000* And this returns 631580 documents when I was expecting 0: *q=doc-id:3000 AND (+Title:() +Classification:() +Contributors:() +text:())* Am I missing something here? Can someone please explain? I am using Solr 4.2.1 Thanks -Shankar -- Regards, *Shankar Sundararaju *Sr. Software Architect ebrary, a ProQuest company 410 Cambridge Avenue, Palo Alto, CA 94306 USA shan...@ebrary.com | www.ebrary.com | 650-475-8776 (w) | 408-426-3057 (c)
FW: howto: get the value from a multivalued field?
hi, all - how can I retrieve the value out of a multivalued field in a customized function query? I want to implement a function query whose first parameter is a multi-valued field, from which values are retrieved and manipulated. However, when I use the code below I get the exception "can not use FieldCache on multivalued field":

public ValueSource parse(FunctionQParser fp) throws ParseException {
    try {
        ValueSource vs = fp.parseValueSource();
    } catch (...) {
    }
}

Thanks. - Frank
Re: howto: get the value from a multivalued field?
Yeah, you can't do that. You'll need to keep a copy of whichever value from the multi-valued field you wish to be considered the value in a separate, non-multi-valued field. Possibly using an update processor, such as one of: FirstFieldValueUpdateProcessorFactory, LastFieldValueUpdateProcessorFactory, MaxFieldValueUpdateProcessorFactory, MinFieldValueUpdateProcessorFactory -- Jack Krupansky -Original Message- From: world hello Sent: Thursday, May 23, 2013 7:50 PM To: solr-user@lucene.apache.org Subject: FW: howto: get the value from a multivalued field? hi, all - how can I retrieve the value out of a multivalued field in a customized function query?I want to implement a function query whose first parameter is a multi-value fileld, from which values are retrieved and manipulated. however, I used the code but get exceptions - can not use FieldCache on multivalued field /public ValueSource parse(FunctionQParser fp) throws ParseException { try { ValueSource vs = fp.parseValueSource(); } catch (...) { } Thanks. - Frank
RE: howto: get the value from a multivalued field?
thanks, jack. could you please give more details on using update processor? Thanks. - frank From: j...@basetechnology.com To: solr-user@lucene.apache.org Subject: Re: howto: get the value from a multivalued field? Date: Thu, 23 May 2013 20:06:34 -0400 Yeah, you can't do that. You'll need to keep a copy of whichever value from the multi-valued field you wish to be considered the value in a separate, non-multi-valued field. Possibly using an update processor, such as one of: FirstFieldValueUpdateProcessorFactory, LastFieldValueUpdateProcessorFactory, MaxFieldValueUpdateProcessorFactory, MinFieldValueUpdateProcessorFactory -- Jack Krupansky -Original Message- From: world hello Sent: Thursday, May 23, 2013 7:50 PM To: solr-user@lucene.apache.org Subject: FW: howto: get the value from a multivalued field? hi, all - how can I retrieve the value out of a multivalued field in a customized function query?I want to implement a function query whose first parameter is a multi-value fileld, from which values are retrieved and manipulated. however, I used the code but get exceptions - can not use FieldCache on multivalued field /public ValueSource parse(FunctionQParser fp) throws ParseException { try { ValueSource vs = fp.parseValueSource(); } catch (...) { } Thanks. - Frank
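As a follow-up, a sketch of how the processors Jack lists might be wired into an update chain in solrconfig.xml. The field names ("prices", "first_price") and the chain name are made-up placeholders; the idea is to clone the multi-valued field into a single-valued one and keep only its first value at index time:

```xml
<!-- solrconfig.xml (sketch): copy multi-valued "prices" into
     single-valued "first_price", keeping only the first value -->
<updateRequestProcessorChain name="first-value" default="true">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">prices</str>
    <str name="dest">first_price</str>
  </processor>
  <processor class="solr.FirstFieldValueUpdateProcessorFactory">
    <str name="fieldName">first_price</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

With "first_price" declared non-multi-valued in schema.xml, it can then be used safely in a function query.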
Re: Can anyone explain this Solr query behavior?
(+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and | Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0/no_coord You're using edismax, not lucene. So AND is being considered as a search term, not an operator, and the word 'and' probably exists in 631580 documents. Why is it triggering dismax? Probably because field:() is not valid syntax, so edismax is dropping to dismax because it isn't a valid Lucene query. What do you expect text:() to do? If you want to match any docs that have a value in the text field, use q=text:[* TO *]. To match docs that *don't* have a value in the text field: q=-text:[* TO *] Upayavira On Fri, May 24, 2013, at 12:23 AM, Shankar Sundararaju wrote: Hi Erick, Here's the output after turning on the debug flag: *q=text:()debug=query* yields response lst name=responseHeader int name=status0/int int name=QTime17/int lst name=params str name=indenttrue/str str name=qtext:()/str str name=debugquery/str /lst /lst result name=response numFound=0 start=0 maxScore=0.0/result lst name=debug str name=rawquerystringtext:()/str str name=querystringtext:()/str str name=parsedquery(+())/no_coord/str str name=parsedquery_toString+()/str str name=QParserExtendedDismaxQParser/str null name=altquerystring/ null name=boost_queries/ arr name=parsed_boost_queries/ null name=boostfuncs/ /lst /response *q=doc-id:3000debug=query* yields response lst name=responseHeader int name=status0/int int name=QTime17/int lst name=params str name=qdoc-id:3000/str str name=debugquery/str /lst /lst result name=response numFound=1 start=0 maxScore=11.682044 doc : : /doc /result lst name=debug str name=rawquerystringdoc-id:3000/str str name=querystringdoc-id:3000/str str name=parsedquery(+doc-id:3000)/no_coord/str str name=parsedquery_toString+doc-id:`#8;#0;#0;#23;8/str str name=QParserExtendedDismaxQParser/str null name=altquerystring/ null name=boost_queries/ arr name=parsed_boost_queries/ null name=boostfuncs/ /lst /response *q=doc-id:3000 AND
text:()debug=query* yields response lst name=responseHeader int name=status0/int int name=QTime23/int lst name=params str name=qdoc-id:3000 AND text:()/str str name=debugquery/str /lst /lst result name=response numFound=631647 start=0 maxScore=8.056607 doc : /doc : /doc doc : /doc doc : /doc doc : /doc doc : /doc /result lst name=debug str name=rawquerystringdoc-id:3000 AND text:()/str str name=querystringdoc-id:3000 AND text:()/str str name=parsedquery (+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and | Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0/no_coord /str str name=parsedquery_toString +(doc-id:`#8;#0;#0;#23;8 (Publisher:and^2.0 | text:and | Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0)) /str str name=QParserExtendedDismaxQParser/str null name=altquerystring/ null name=boost_queries/ arr name=parsed_boost_queries/ null name=boostfuncs/ /lst /response *solrconfig.xml:* requestHandler name=/select class=solr.SearchHandler lst name=defaults str name=echoParamsexplicit/str int name=rows10/int str name=dftext/str str name=defTypeedismax/str str name=qftext^1.0 Title^3.0 Classification^2.0 Contributors^2.0 Publisher^2.0/str /lst *schema.xml:* field name=text type=my_text indexed=true stored=false required= false/* * dynamicField name=* type=my_text indexed=true stored=true multiValued=false/ fieldType name=my_text class=solr.TextField analyzer type=index class=MyAnalyzer/ analyzer type=query class=MyAnalyzer/ analyzer type=multiterm class=MyAnalyzer/ /fieldType * * *Note:* MyAnalyzer among few other customizations, uses WhitespaceTokenizer and LoweCaseFilter Thanks a lot. -Shankar On Thu, May 23, 2013 at 4:34 AM, Erick Erickson erickerick...@gmail.comwrote: Please post the results of adding debug=query to the URL. That'll tell us what the query parser spits out which is much easier to analyze. 
Best Erick On Wed, May 22, 2013 at 12:16 PM, Shankar Sundararaju shan...@ebrary.com wrote: This query returns 0 documents: *q=(+Title:() +Classification:() +Contributors:() +text:())* This returns 1 document: *q=doc-id:3000* And this returns 631580 documents when I was expecting 0: *q=doc-id:3000 AND (+Title:() +Classification:() +Contributors:() +text:())* Am I missing something here? Can someone please explain? I am using Solr 4.2.1 Thanks -Shankar -- Regards, *Shankar Sundararaju *Sr. Software Architect ebrary, a ProQuest company 410 Cambridge Avenue, Palo Alto, CA 94306 USA shan...@ebrary.com | www.ebrary.com | 650-475-8776 (w) | 408-426-3057 (c)
Re: Can anyone explain this Solr query behavior?
Okay... sorry I wasn't paying close enough attention. What is happening is that the empty parentheses are illegal in Lucene query syntax:

<str name="msg">org.apache.solr.search.SyntaxError: Cannot parse 'id:* AND text:()': Encountered ")" at line 1, column 15. Was expecting one of: <NOT> ... "+" ... "-" ... <BAREOPER> ... "(" ... "*" ... <QUOTED> ... <TERM> ... <PREFIXTERM> ... <WILDTERM> ... <REGEXPTERM> ... "[" ... "{" ... <LPARAMS> ... <NUMBER> ... <TERM> ... "*" ...</str> <int name="code">400</int>

Edismax traps such errors and then escapes the query so that Lucene will no longer throw an error. In this case, it puts quotes around the AND operator, which is why you see "and" included in the parsed query as if it were a term. And I believe it turns text:() into text:"()", which makes the original Lucene error go away, but the "()" analyzes to nothing and generates no term in the query. So, fix your syntax error and the anomaly should go away. -- Jack Krupansky -Original Message- From: Shankar Sundararaju Sent: Thursday, May 23, 2013 7:23 PM To: solr-user@lucene.apache.org Subject: Re: Can anyone explain this Solr query behavior?
Hi Erick, Here's the output after turning on the debug flag: *q=text:()debug=query* yields response lst name=responseHeader int name=status0/int int name=QTime17/int lst name=params str name=indenttrue/str str name=qtext:()/str str name=debugquery/str /lst /lst result name=response numFound=0 start=0 maxScore=0.0/result lst name=debug str name=rawquerystringtext:()/str str name=querystringtext:()/str str name=parsedquery(+())/no_coord/str str name=parsedquery_toString+()/str str name=QParserExtendedDismaxQParser/str null name=altquerystring/ null name=boost_queries/ arr name=parsed_boost_queries/ null name=boostfuncs/ /lst /response *q=doc-id:3000debug=query* yields response lst name=responseHeader int name=status0/int int name=QTime17/int lst name=params str name=qdoc-id:3000/str str name=debugquery/str /lst /lst result name=response numFound=1 start=0 maxScore=11.682044 doc : : /doc /result lst name=debug str name=rawquerystringdoc-id:3000/str str name=querystringdoc-id:3000/str str name=parsedquery(+doc-id:3000)/no_coord/str str name=parsedquery_toString+doc-id:`#8;#0;#0;#23;8/str str name=QParserExtendedDismaxQParser/str null name=altquerystring/ null name=boost_queries/ arr name=parsed_boost_queries/ null name=boostfuncs/ /lst /response *q=doc-id:3000 AND text:()debug=query* yields response lst name=responseHeader int name=status0/int int name=QTime23/int lst name=params str name=qdoc-id:3000 AND text:()/str str name=debugquery/str /lst /lst result name=response numFound=631647 start=0 maxScore=8.056607 doc : /doc : /doc doc : /doc doc : /doc doc : /doc doc : /doc /result lst name=debug str name=rawquerystringdoc-id:3000 AND text:()/str str name=querystringdoc-id:3000 AND text:()/str str name=parsedquery (+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and | Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0/no_coord /str str name=parsedquery_toString +(doc-id:`#8;#0;#0;#23;8 (Publisher:and^2.0 | text:and | Classification:and^2.0 | 
Contributors:and^2.0 | Title:and^3.0)) /str str name=QParserExtendedDismaxQParser/str null name=altquerystring/ null name=boost_queries/ arr name=parsed_boost_queries/ null name=boostfuncs/ /lst /response *solrconfig.xml:* requestHandler name=/select class=solr.SearchHandler lst name=defaults str name=echoParamsexplicit/str int name=rows10/int str name=dftext/str str name=defTypeedismax/str str name=qftext^1.0 Title^3.0 Classification^2.0 Contributors^2.0 Publisher^2.0/str /lst *schema.xml:* field name=text type=my_text indexed=true stored=false required= false/* * dynamicField name=* type=my_text indexed=true stored=true multiValued=false/ fieldType name=my_text class=solr.TextField analyzer type=index class=MyAnalyzer/ analyzer type=query class=MyAnalyzer/ analyzer type=multiterm class=MyAnalyzer/ /fieldType * * *Note:* MyAnalyzer among few other customizations, uses WhitespaceTokenizer and LoweCaseFilter Thanks a lot. -Shankar On Thu, May 23, 2013 at 4:34 AM, Erick Erickson erickerick...@gmail.comwrote: Please post the results of adding debug=query to the URL. That'll tell us what the query parser spits out which is much easier to analyze. Best Erick On Wed, May 22, 2013 at 12:16 PM, Shankar Sundararaju shan...@ebrary.com wrote: This query returns 0 documents: *q=(+Title:() +Classification:() +Contributors:() +text:())* This returns 1 document: *q=doc-id:3000* And this returns 631580 documents when I was expecting 0: *q=doc-id:3000 AND (+Title:() +Classification:() +Contributors:() +text:())* Am I missing something here? Can someone please explain? I am using Solr 4.2.1 Thanks -Shankar -- Regards, *Shankar Sundararaju *Sr. Software
RE: Question about Coordination factor
Thank you for your comment. For historical reasons, our organization uses a trunk version of Solr 4.0, which is a bit old and unofficial. And edismax always returns 1/2 as the coordination value, so I wanted to make sure what this value should be. This will be a good reason to upgrade our system to Solr 4.3 or a later version. Thank you very much! -Kazu From: ans...@anshumgupta.net Date: Thu, 23 May 2013 16:58:46 +0530 Subject: Re: Question about Coordination factor To: solr-user@lucene.apache.org This looks correct. On Thu, May 23, 2013 at 7:37 AM, Kazuaki Hiraga kazuaki.hir...@gmail.com wrote: Hello folks, Sorry, my last email was a bit messy, so I am sending it again. I have a question about the coordination factor, to ensure my understanding of this value is correct. If I have documents that contain some keywords like the following: Doc1: A, B, C Doc2: A, C Doc3: B, C and my query is A OR B OR C OR D, the coord factor value for each document will be the following: Doc1: 3/4 Doc2: 2/4 Doc3: 2/4 In the same fashion, the respective coord factor values are the following if I have the query C OR D: Doc1: 1/2 Doc2: 1/2 Doc3: 1/2 Is this correct, or did I miss something? Please correct me if I am wrong. Regards, Kazuaki -- Anshum Gupta http://www.anshumgupta.net
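For reference, Lucene's default coord(q, d) is simply overlap/maxOverlap: the fraction of optional query clauses a document matches. A quick sketch of the arithmetic behind the numbers in this thread (plain Python, not the Solr API):

```python
def coord(overlap, max_overlap):
    # Lucene DefaultSimilarity.coord(): fraction of query clauses matched
    return overlap / float(max_overlap)

# Query: A OR B OR C OR D  (maxOverlap = 4)
print(coord(3, 4))  # Doc1 {A, B, C} matches 3 of 4 terms -> 0.75
print(coord(2, 4))  # Doc2 {A, C} and Doc3 {B, C}         -> 0.5

# Query: C OR D  (maxOverlap = 2); all three docs contain only C
print(coord(1, 2))  # -> 0.5
```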