Urgent: Cannot index binary data stored in DB as BLOB type
Sir, please send me the data-config file to index binary data that is stored in the database as a BLOB type. Thanking you, Chandan
Re: Urgent: Cannot index binary data stored in DB as BLOB type
On 25 February 2014 14:27, Chandan khatua chand...@nrifintech.com wrote: Sir, please send me the data-config file to index binary data that is stored in the database as a BLOB type.

Are you paying attention to the follow-ups? I had already suggested possibilities, including the fact that Solr cannot automatically decide whether a blob contains rich text or not. Please do not start multiple threads for the same issue.

Regards, Gora
RE: Cannot index raw binary data stored in the database in BLOB format.
Hi Gora,

The column type in the DB is BLOB. It only stores binary data. If I do not use TikaEntityProcessor, then the following exception occurs:

at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:457)
59163 [Thread-16] ERROR org.apache.solr.handler.dataimport.DocBuilder - Exception while processing: messages document : SolrInputDocument(fields: [id=2158]): org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.ClassCastException: oracle.jdbc.driver.OracleBlobInputStream cannot be cast to java.util.Iterator
at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:65)
at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:469)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:495)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:408)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:323)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:231)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:411)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:476)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:457)
Caused by: java.lang.ClassCastException: oracle.jdbc.driver.OracleBlobInputStream cannot be cast to java.util.Iterator
at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
... 10 more

I have used ClobTransformer in the data-config file as below, and even then it is not working:

<dataConfig>
  <dataSource name="db" driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@//a.a.a.a:a/d11gr21" user="" password="a"/>
  <dataSource name="dastream" type="FieldStreamDataSource"/>
  <document>
    <entity name="messages" pk="x_MSG_PK" query="select * from table1" dataSource="db">
      <field column="x_MSG_PK" name="id"/>
      <entity name="message" transformer="ClobTransformer" dataSource="dastream"
              processor="TikaEntityProcessor" dataField="messages.MESSAGE" format="text">
        <field column="text" name="mxMsg" clob="true"/>
      </entity>
    </entity>
  </document>
</dataConfig>

So, what changes do I need?

-Chandan

-----Original Message-----
From: Gora Mohanty [mailto:g...@mimirtech.com]
Sent: Monday, February 24, 2014 5:49 PM
To: solr-user@lucene.apache.org
Subject: Re: Cannot index raw binary data stored in the database in BLOB format.

On 24 February 2014 15:34, Chandan khatua chand...@nrifintech.com wrote: Hi Gora! Your concern was "What is the type of the column used to store the binary data in Oracle?" The column type is BLOB in the DB. The column can also hold a rich text file.

Um, your original message said that it does *not* contain rich-text data. How do you tell whether it has rich-text data or not? For just a binary blob, the ClobTransformer should work, but you need the TikaEntityProcessor for rich-text data. If you do not know whether the data in the blob is rich text or not, you will need to roll your own solution to determine that.

Regards, Gora
Re: Fetching uniqueKey and other int quickly from documentCache?
I vaguely remember such a Jira issue but I can't find it now. Gregg, can you open an issue? A patch would be even better.

On Tue, Feb 25, 2014 at 8:28 AM, Gregg Donovan gregg...@gmail.com wrote: We fetch a large number of documents -- 1000+ -- for each search. Each request fetches only the uniqueKey, or the uniqueKey plus one secondary integer key. Despite this, we find that we spend a sizable amount of time in SolrIndexSearcher#doc(int docId, Set<String> fields). Time is spent fetching the two stored fields, LZ4 decoding, etc.

I would love to be able to tell Solr to always fetch these two fields from memory. We have them both in the fieldCache, so we're already spending the RAM. I've seen this asked previously [1], so it seems like a fairly common need, especially for distributed search. Any ideas? A few possible ideas I had:

--Check FieldCache#getCacheEntries() before going to stored fields.
--Give the documentCache config a list of fields it should load from the fieldCache.

Having an in-memory mapping from docId to uniqueKey has come up for us before. We've used a custom SolrCache maintaining that mapping to quickly filter over personalized collections. Maybe the uniqueKey should be more optimized out of the box? Perhaps a custom uniqueKey codec that also maintained the docId-to-uniqueKey mapping in memory?

--Gregg

[1] http://search-lucene.com/m/oCUKJ1heHUU1

--
Regards, Shalin Shekhar Mangar.
Re: Cannot index raw binary data stored in the database in BLOB format.
On 25 February 2014 14:54, Chandan khatua chand...@nrifintech.com wrote: Hi Gora, The column type in DB is BLOB. It only stores binary data. If I do not use TikaEntityProcessor, then the following exception occurs: [...] It is difficult to follow what you are doing when you say one thing, and seem to do another. You say above that you are not using TikaEntityProcessor but your DIH data configuration file shows that you are. Please start with one configuration, and show us the *exact* files in use, and the error from the Solr logs. Regards, Gora
RE: Cannot index raw binary data stored in the database in BLOB format.
Okay. Here is my data-config file:

<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
  <dataSource name="db" driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@//1.2.3.4:1/d11gr21" user="" password=""/>
  <dataSource name="dastream" type="FieldStreamDataSource"/>
  <document>
    <entity name="messages" pk="X_MSG_PK" query="select * from table1" dataSource="db">
      <field column="X_MSG_PK" name="id"/>
      <entity name="message" transformer="ClobTransformer" dataSource="dastream"
              processor="TikaEntityProcessor" dataField="messages.MESSAGE" format="text">
        <field column="text" name="mxMsg" clob="true"/>
      </entity>
    </entity>
  </document>
</dataConfig>

--

Solr.log file:

INFO - 2014-02-25 17:33:40.023; org.apache.solr.core.SolrCore; [CHESS_CORE] webapp=/solr path=/admin/mbeans params={cat=QUERYHANDLER&_=1393329819994&wt=json} status=0 QTime=1
INFO - 2014-02-25 17:33:40.094; org.apache.solr.core.SolrCore; [CHESS_CORE] webapp=/solr path=/admin/mbeans params={cat=QUERYHANDLER&_=1393329820083&wt=json} status=0 QTime=0
INFO - 2014-02-25 17:33:40.117; org.apache.solr.core.SolrCore; [CHESS_CORE] webapp=/solr path=/dataimport params={indent=true&command=status&_=1393329820089&wt=json} status=0 QTime=16
INFO - 2014-02-25 17:33:40.131; org.apache.solr.core.SolrCore; [CHESS_CORE] webapp=/solr path=/dataimport params={indent=true&command=show-config&_=1393329820084} status=0 QTime=29
INFO - 2014-02-25 17:33:42.026; org.apache.solr.handler.dataimport.DataImporter; Loading DIH Configuration: /dataconfig/data-config.xml
INFO - 2014-02-25 17:33:42.031; org.apache.solr.handler.dataimport.DataImporter; Data Configuration loaded successfully
INFO - 2014-02-25 17:33:42.033; org.apache.solr.core.SolrCore; [CHESS_CORE] webapp=/solr path=/dataimport params={optimize=false&indent=true&clean=true&commit=true&verbose=false&command=full-import&debug=false&wt=json} status=0 QTime=8
INFO - 2014-02-25 17:33:42.035; org.apache.solr.handler.dataimport.DataImporter; Starting Full Import
INFO - 2014-02-25 17:33:42.043; org.apache.solr.core.SolrCore; [CHESS_CORE] webapp=/solr path=/dataimport params={indent=true&command=status&_=1393329822040&wt=json} status=0 QTime=0
INFO - 2014-02-25 17:33:42.064; org.apache.solr.handler.dataimport.SimplePropertiesWriter; Read dataimport.properties
INFO - 2014-02-25 17:33:42.092; org.apache.solr.search.SolrIndexSearcher; Opening Searcher@2a858a73 realtime
INFO - 2014-02-25 17:33:42.093; org.apache.solr.handler.dataimport.JdbcDataSource$1; Creating a connection for entity messages with URL: jdbc:oracle:thin:@//172.16.29.92:1521/d11gr21
INFO - 2014-02-25 17:33:42.113; org.apache.solr.handler.dataimport.JdbcDataSource$1; Time taken for getConnection(): 19
INFO - 2014-02-25 17:33:42.564; org.apache.solr.handler.dataimport.DocBuilder; Import completed successfully
INFO - 2014-02-25 17:33:42.564; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO - 2014-02-25 17:33:42.867; org.apache.solr.core.SolrDeletionPolicy; SolrDeletionPolicy.onCommit: commits: num=2
commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@C:\solr-4.5.1\example\multicore\CHESS_CORE\data\index lockFactory=org.apache.lucene.store.NativeFSLockFactory@2c6d8073; maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_l,generation=21}
commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@C:\solr-4.5.1\example\multicore\CHESS_CORE\data\index lockFactory=org.apache.lucene.store.NativeFSLockFactory@2c6d8073; maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_m,generation=22}
INFO - 2014-02-25 17:33:42.868; org.apache.solr.core.SolrDeletionPolicy; newest commit generation = 22
INFO - 2014-02-25 17:33:42.882; org.apache.solr.search.SolrIndexSearcher; Opening Searcher@558ea0cc main
INFO - 2014-02-25 17:33:42.886; org.apache.solr.core.QuerySenderListener; QuerySenderListener sending requests to Searcher@558ea0cc main{StandardDirectoryReader(segments_m:55:nrt _d(4.5.1):C80)}
INFO - 2014-02-25 17:33:42.889; org.apache.solr.core.QuerySenderListener; QuerySenderListener done.
INFO - 2014-02-25 17:33:42.889; org.apache.solr.core.SolrCore; [CHESS_CORE] Registered new searcher Searcher@558ea0cc main{StandardDirectoryReader(segments_m:55:nrt _d(4.5.1):C80)}
INFO - 2014-02-25 17:33:42.893; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
INFO - 2014-02-25 17:33:42.899; org.apache.solr.handler.dataimport.SimplePropertiesWriter; Read dataimport.properties
INFO - 2014-02-25 17:33:42.901; org.apache.solr.handler.dataimport.SimplePropertiesWriter; Wrote last indexed time to dataimport.properties
INFO - 2014-02-25
Re: Cannot index raw binary data stored in the database in BLOB format.
A few things:

1) If your database uses a BLOB, you should not use ClobTransformer; FieldStreamDataSource should be sufficient.

2) In a previous message, it showed that the converted/extracted document was empty (except for an HTML boilerplate wrapper). This was using the configuration I suggested. I'm guessing that TikaEntityProcessor is either receiving empty strings as source, or failing to extract the content of certain file formats. To test the latter, you could export one of the blobs to a file and run the stand-alone Tika app on it.

As to the possibility that TikaEntityProcessor is receiving empty strings as input: I had a similar issue, but with varchars. In my case, the reason was that I was running a really old version of Oracle, which would not work with recent versions of the Oracle support libraries.

Another thing that might be worth checking: you use "select * ..." as the main query. Have you tried explicitly listing the columns you're interested in? Something like "select X_MSG_PK, MESSAGE from table1" (a sketch along these lines follows below).

On Tue, Feb 25, 2014 at 1:11 PM, Chandan khatua chand...@nrifintech.com wrote: Okay. Here is my data-config file: [...]
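[Editor's note: for reference, here is a minimal, untested sketch of a data-config along the lines suggested above -- ClobTransformer and clob="true" dropped, an explicit column list, and TikaEntityProcessor reading the BLOB through FieldStreamDataSource. The connection URL, user, and password are placeholders, not values from the thread.]

<dataConfig>
  <dataSource name="db" driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@//host:port/SID" user="..." password="..."/>
  <!-- Streams the BLOB column to Tika instead of materializing it as a string -->
  <dataSource name="dastream" type="FieldStreamDataSource"/>
  <document>
    <entity name="messages" pk="X_MSG_PK" dataSource="db"
            query="select X_MSG_PK, MESSAGE from table1">
      <field column="X_MSG_PK" name="id"/>
      <!-- Tika extracts plain text from whatever format the blob contains -->
      <entity name="message" processor="TikaEntityProcessor"
              dataSource="dastream" dataField="messages.MESSAGE" format="text">
        <field column="text" name="mxMsg"/>
      </entity>
    </entity>
  </document>
</dataConfig>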
Performance problem on Solr query on stemmed values
Hi,

I would like to know whether anyone has experienced this kind of phenomenon. We are having a performance problem with queries on stemmed values. I've documented the symptoms I'm currently facing:

Search on field "content" | Search on field "spell" | Highlighting (on content field) | Processing speed
active      | active      | active      | slow
active      | not active  | active      | fast
active      | active      | not active  | fast
not active  | active      | active      | slow
not active  | active      | not active  | fast

*Fast means 1000x faster than slow.

The field "content" is our index field, which holds the original text, and "spell" is the field with the stemmed values. According to my measurements, search on either field (stemmed or not stemmed) is really fast. But as soon as I add highlighting to the query, it takes far too long to process.

Best Regards, Erwin
programmatically disable/enable solr queryResultCache...
Is there any way to programmatically disable/enable the Solr queryResultCache? I am using SolrJ.

Thanks & Regards, Senthilnathan V
Re: Performance problem on Solr query on stemmed values
Right, highlighting may have to re-analyze the input in order to return the highlighted data. This will be significantly slower than the search, especially if you're returning a large number of rows. You can get better performance in highlighting by using FastVectorHighlighter. See: https://cwiki.apache.org/confluence/display/solr/FastVector+Highlighter

1000x is unusual, though, unless your fields are very large or you're returning a lot of documents.

Best, Erick

On Tue, Feb 25, 2014 at 5:23 AM, Erwin Gunadi festiva.s...@gmail.com wrote: [...]
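[Editor's note: a rough sketch of what FastVectorHighlighter needs, per the reference guide page linked above: the highlighted field must be indexed with term vectors, positions, and offsets, and the request must opt in. The field name "content" matches the thread, but the type and the rest of the field definition are illustrative, not from Erwin's schema; changing the field definition requires a re-index.]

<field name="content" type="text_general" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>

Then the request opts in per query:

q=some+query&hl=true&hl.fl=content&hl.useFastVectorHighlighter=true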
Re: programmatically disable/enable solr queryResultCache...
This seems like an XY problem: you're asking for specifics on doing something without any indication of _why_ you think it would help, nor are you explaining what problem you're having in the first place.

At any rate, the queryResultCache is unlikely to impact much. All it is is a map from the query to the first few document IDs (internal Lucene); see queryResultWindowSize in solrconfig.xml. It is quite lightweight: it does NOT store the entire result set, nor even the contents of the documents.

Best, Erick

On Tue, Feb 25, 2014 at 6:07 AM, Senthilnathan Vijayaraja senthilnat...@8kmiles.com wrote: Is there any way to programmatically disable/enable the Solr queryResultCache? I am using SolrJ. [...]
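[Editor's note: to make the configuration concrete -- there is no per-request switch for the queryResultCache in SolrJ; it is configured in solrconfig.xml and picked up on core reload. As far as I know, omitting the element disables the cache entirely. Sizes below are illustrative:]

<query>
  <!-- Comment this element out (and reload the core) to run without the cache -->
  <queryResultCache class="solr.LRUCache"
                    size="512" initialSize="512" autowarmCount="0"/>
  <!-- How many document ids are cached per query entry -->
  <queryResultWindowSize>20</queryResultWindowSize>
</query>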
SolrCloud: How to replicate shard of another machine for failover?
Hi,

tl;dr: I'm having trouble configuring SolrCloud 4.3.1 to replicate the shard of another machine. Basically, what it boils down to is the question of how to tell one Solr instance to replicate the shard of another machine. I thought the system property `-Dshard=2` would do the trick, but it doesn't do anything. What to do?

---

I want the following setup:

                      leader.host_1:7070
                     /
              shard1
             /       \
            /         replica.host_2:7071
  collection
            \         leader.host_2:7070
             \       /
              shard2
                     \
                      replica.host_1:7071

I want to run two logical instances (leader & replica) of Solr on each physical machine (host_1 & host_2). Everything is running, but each shard is replicated on the same physical machine, which doesn't work as a failover mechanism. So at the moment the layout is as follows:

                      leader.host_1:7070
                     /
              shard1
             /       \
            /         replica.host_1:7071
  collection
            \         leader.host_2:7070
             \       /
              shard2
                     \
                      replica.host_2:7071

I basically run the following commands on each machine. First on host_1:

host_1$ java -Djetty.home=/opt/solr -DnumShards=2 -Dcollection.configName=solrconfig.xml -DzkHost=localhost:2181 -Djetty.port=7070 -Dsolr.solr.home=/opt/solr -Dbootstrap_confdir=conf -cp <classpath> ...

host_1$ java -Djetty.home=/opt/solr-replica-1 -DnumShards=2 -Dshard=shard2 -Dcollection.configName=solrconfig.xml -DzkHost=localhost:2181 -Djetty.port=7071 -Dsolr.solr.home=/opt/solr-replica-1 -Dbootstrap_confdir=conf -cp <classpath> ...

Then on host_2:

host_2$ java -Djetty.home=/opt/solr -DnumShards=2 -Dcollection.configName=solrconfig.xml -DzkHost=localhost:2181 -Djetty.port=7070 -Dsolr.solr.home=/opt/solr -Dbootstrap_confdir=conf -cp <classpath> ...

host_2$ java -Djetty.home=/opt/solr-replica-1 -DnumShards=2 -Dshard=shard1 -Dcollection.configName=solrconfig.xml -DzkHost=localhost:2181 -Djetty.port=7071 -Dsolr.solr.home=/opt/solr-replica-1 -Dbootstrap_confdir=conf -cp <classpath> ...

Am I using the wrong configuration parameter? Is this behaviour possible (with Solr 4.3)?

Best regards, Oliver
Re: SolrCloud: How to replicate shard of another machine for failover?
Oliver,

You'll probably have better luck not supplying CLI arguments and instead creating your collection via the Collections API (https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-CreateaCollection). Try removing -DnumShards and setting -Dcollection.configName to something abstract such as "collection1" rather than "solrconfig.xml", as you'll actually end up creating a directory in ZooKeeper called solrconfig.xml, which can get confusing. Something like:

http://localhost:7071/solr/admin/collections?action=CREATE&name=collection1&numShards=2&replicationFactor=2&maxShardsPerNode=2&collection.configName=collection1

should fit what you're trying to accomplish.

Thanks, Greg

On Feb 25, 2014, at 9:09 AM, Oliver Schrenk oliver.schr...@gmail.com wrote: [...]
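[Editor's note: for completeness, the usual two-step sketch of this approach with the zkcli script that ships with Solr 4.x -- upload the config set under a name, then create the collection through the API instead of bootstrap_confdir system properties. The script path and config directory are assumptions based on a default layout:]

# upload the config set to ZooKeeper under the name "collection1"
./example/scripts/cloud-scripts/zkcli.sh -cmd upconfig -zkhost localhost:2181 \
    -confdir /opt/solr/conf -confname collection1

# create a 2-shard collection with one replica per shard, allowing two cores per node
curl 'http://localhost:7071/solr/admin/collections?action=CREATE&name=collection1&numShards=2&replicationFactor=2&maxShardsPerNode=2&collection.configName=collection1'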
Re: SolrCloud: How to replicate shard of another machine for failover?
I don't actually run these commands; everything is written down in either jetty.conf or solr.xml. I basically copy-pasted the output of `ps -ef | grep solr`.

Is the Collections API the only way to do this? At the moment this is a proof of concept, but for going to production I want to put this into Puppet, and I would feel more comfortable using configuration files than making a call to a web service.

On 25 Feb 2014, at 16:19, Greg Walters greg.walt...@answers.com wrote: [...]
Re: SolrCloud: How to replicate shard of another machine for failover?
On 2/25/2014 8:09 AM, Oliver Schrenk wrote: I want to run two logical instances (leader replica) of Solr on each physical machine (host_1 host_2). Everything is running but the shard is replicated on the same physical machine! Which doesn't work as a failover mechanism. So at the moment the layout is as follows: Don't run multiple instances of Solr on one machine. Instead, run one instance per machine and create the collection with the maxShardsPerNode parameter set to 2 or whatever value you need. Running multiple instances is a waste of memory, and Solr is perfectly capable of running multiple indexes (cores) on one instance. When there is one Solr instance per machine, SolrCloud will never put replicas on the same machine unless you specifically build them that way with the CoreAdmin API. The way you've set it up, SolrCloud just sees that you have four Solr instances. It does not know that they are on the same machine. As far as it is concerned, they are entirely separate. You might think that it should be able to see that they have the same hostname or IP address, but if we checked for that, we would lose a *lot* of flexibility that users demand. It would be impossible to set up test instances where they are all on the same machine. There are probably other networking scenarios that wouldn't function properly. Something that would be a good idea is an optional config flag that would make SolrCloud compare hostnames when building a collection and avoid putting replicas on nodes where the hostname matches. Whether to default this option to on or off is a whole separate discussion. Yet another whole separate discussion: You need three physical nodes for a redundant zookeeper, but I see only one host (localhost) in your zkHost parameter. Thanks, Shawn
CollapseQParserPlugin problem with ElevateComponent
I am having trouble with CollapseQParserPlugin showing duplicate groups when the search results contain a member of a grouped document but another member of that grouped document is defined in the elevate component. I have described the issue in more detail here: https://issues.apache.org/jira/browse/SOLR-5773

Any help is appreciated. Also, any hints as to how I can solve this problem myself would be great, as I'm having a bit of trouble understanding the code well enough to implement a fix.
Re: SolrCloud: How to replicate shard of another machine for failover?
Hi;

Nodes are assigned round-robin within the cluster. To get the layout you want, you should change the startup order of your Solr instances.

Thanks; Furkan KAMACI

2014-02-25 19:17 GMT+02:00 Shawn Heisey s...@elyograg.org: [...]
Autocommit, opensearchers and ingestion
Hi all,

I'm working with Solr 4.6.1 and I'm trying to tune my ingestion process. The ingestion runs a big DB query, does some ETL, and inserts via SolrJ. I have a 4-node cluster with 1 shard per node, running in Tomcat with -Xmx4096M. Each node has a separate instance of ZooKeeper on it, and the ingestion server has one as well. The Solr servers have 8 cores and 64 GB of total RAM. The ingestion server is a VM with 8 GB and 2 cores.

My ingestion code uses a few settings to control concurrency and batch size:

solr.update.batchSize=500
solr.threadCount=4

With this setup, I'm getting a lot of errors and the ingestion is taking much longer than it should. Every so often during the ingestion I get these errors on the Solr servers:

WARN shard1 - 2014-02-25 11:18:34.341; org.apache.solr.update.UpdateLog$LogReplayer; Starting log replay tlog{file=/usr/local/solr_shard1/productCatalog/data/tlog/tlog.0014074 refcount=2} active=true starting pos=776774
WARN shard1 - 2014-02-25 11:18:37.275; org.apache.solr.update.UpdateLog$LogReplayer; Log replay finished. recoveryInfo=RecoveryInfo{adds=4065 deletes=0 deleteByQuery=0 errors=0 positionOfStart=776774}
WARN shard1 - 2014-02-25 11:18:37.960; org.apache.solr.core.SolrCore; [productCatalog] PERFORMANCE WARNING: Overlapping onDeckSearchers=2
WARN shard1 - 2014-02-25 11:18:37.961; org.apache.solr.core.SolrCore; [productCatalog] Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later.
WARN shard1 - 2014-02-25 11:18:37.961; org.apache.solr.core.SolrCore; [productCatalog] Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later.
ERROR shard1 - 2014-02-25 11:18:37.961; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later.
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1575)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1346)
at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:592)

I cut threads down to 1 and batchSize down to 100, and the errors go away, but the upload time jumps up by a factor of 25.

My solrconfig.xml has:

<autoCommit>
  <maxDocs>${solr.autoCommit.maxDocs:1}</maxDocs>
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:1000}</maxTime>
</autoSoftCommit>

I turned autowarmCount down to 0 for all the caches. What else can I tune to allow me to run bigger batch sizes and more threads in my upload script?

--
joel cohen, senior system engineer
e joel.co...@bluefly.com p 212.944.8000 x276
bluefly, inc. 42 w. 39th st. new york, ny 10018
www.bluefly.com | *fly since 2013...*
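[Editor's note: for context, a minimal SolrJ 4.x sketch of the batching approach described above. It relies on autoCommit/autoSoftCommit in solrconfig.xml rather than committing from the client; the ZooKeeper hosts, collection name, field names, and document loop are placeholders, not Joel's actual code.]

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BulkLoader {
    public static void main(String[] args) throws Exception {
        int batchSize = 500; // mirrors solr.update.batchSize above
        CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
        server.setDefaultCollection("productCatalog");
        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>(batchSize);
        for (int i = 0; i < 100000; i++) { // stand-in for the DB/ETL loop
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", Integer.toString(i));
            doc.addField("name", "product " + i);
            batch.add(doc);
            if (batch.size() >= batchSize) {
                server.add(batch);   // no explicit commit; autoCommit handles durability
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            server.add(batch);       // flush the remainder
        }
        server.shutdown();
    }
}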
Re: Autocommit, opensearchers and ingestion
Hi;

You should read this: http://wiki.apache.org/solr/FAQ#What_does_.22exceeded_limit_of_maxWarmingSearchers.3DX.22_mean.3F

On the other hand, do you have 4 ZooKeeper instances as a quorum?

Thanks; Furkan KAMACI

2014-02-25 20:31 GMT+02:00 Joel Cohen joel.co...@bluefly.com: [...]
Wildcard search not working if the query contains numbers along with special characters.
Hi,

I have a very weird problem. Wildcard search works fine in every scenario but one: it doesn't seem to give any result for the query 1999/99*. I checked the debug output, and the query is formed perfectly:

<str name="rawquerystring">title_autocomplete:1999/99*</str>
<str name="querystring">title_autocomplete:1999/99*</str>
<str name="parsedquery">(+title_autocomplete:1999/99* ())/no_coord</str>
<str name="parsedquery_toString">+title_autocomplete:1999/99* ()</str>

This is my fieldType:

<fieldType name="text_general_Title" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Please help me with this. Thanks.
Re: Wildcard search not working if the query contains numbers along with special characters.
Hi Kashish,

What happens when you use this: q={!prefix f=title_autocomplete}1999/99

I suspect the '/' character is a special query parser character and therefore needs to be escaped.

Ahmet

On Tuesday, February 25, 2014 9:55 PM, Kashish itzz.me.kash...@gmail.com wrote: [...]
Re: Autocommit, opensearchers and ingestion
This blog by Erick will help you understand the different commit options and transaction logs, and it offers some recommendations for the ingestion process: http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

On Tue, Feb 25, 2014 at 11:40 AM, Furkan KAMACI furkankam...@gmail.com wrote: [...]
Re: Wildcard search not working if the query contains numbers along with special characters.
Hi Ahmet,

Thanks for your reply. Yes, I pass my query this way: q=title_autocomplete:1999%2f99

I tried your way too. But no luck. :(
Re: Autocommit, opensearchers and ingestion
Gopal: I'm glad somebody noticed that blog!

Joel: For bulk loads it's a Good Thing to lengthen out your soft autocommit interval. A lot. Every second, poor Solr is trying to open up a new searcher while you're throwing lots of documents at it. That's what's generating the "too many searchers" problem, I'd guess. Soft commits are less expensive than hard commits with openSearcher=true (you're not doing that, and you shouldn't be), but soft commits aren't free: all the top-level caches are thrown away and autowarming is performed.

Also, I'd probably consider just leaving the maxDocs bit out of your hard commit; I find it rarely does all that much good. After all, even if you have to replay the transaction log, you're only talking 15 seconds here. (A sketch of bulk-load commit settings follows below.)

Best, Erick

On Tue, Feb 25, 2014 at 12:08 PM, Gopal Patwa gopalpa...@gmail.com wrote: [...]
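[Editor's note: concretely, a hedged sketch of what "lengthen the soft autocommit interval" might look like in solrconfig.xml during bulk loads. The 5-minute value is just an example; -1 disables soft autocommit entirely, and maxDocs is dropped per the advice above.]

<autoCommit>
  <!-- hard commit flushes the tlog; keep openSearcher=false -->
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <!-- 300000 = documents become visible every 5 minutes instead of every second -->
  <maxTime>${solr.autoSoftCommit.maxTime:300000}</maxTime>
</autoSoftCommit>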
Re: Wildcard search not working if the query contains numbers along with special characters.
What does your admin/analysis page say happens for that field? And did you by any chance change your schema without reindexing everything?

Also, try the TermsComponent to see what tokens are actually _in_ your index. The schema browser on the admin page can help here too.

Best, Erick

On Tue, Feb 25, 2014 at 12:05 PM, Ahmet Arslan iori...@yahoo.com wrote: [...]
Re: Wildcard search not working if the query contains numbers along with special characters.
Hi,

By escaping I mean this: q=title_autocomplete:1999\/99*

It is different from URL encoding. See: http://lucene.apache.org/core/4_6_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Escaping_Special_Characters

If the prefix query parser didn't return what you want, then it must be something with the indexed terms. Can you give an example of the raw document text that you expect to retrieve with this query?

On Tuesday, February 25, 2014 10:15 PM, Kashish itzz.me.kash...@gmail.com wrote: Hi Ahmet, Thanks for your reply. Yes, I pass my query this way: q=title_autocomplete:1999%2f99 [...]
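[Editor's note: if the query string is being built in code, a small SolrJ sketch of the escaping approach Ahmet describes: escape the user text (which handles '/') and append the wildcard afterwards, so the '*' itself is not escaped. The field name matches the thread; the rest is illustrative.]

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.util.ClientUtils;

// "1999/99" becomes "1999\/99"; the '*' is appended unescaped to stay a wildcard
String prefix = ClientUtils.escapeQueryChars("1999/99");
SolrQuery query = new SolrQuery("title_autocomplete:" + prefix + "*");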
excludeIds in QueryElevationComponent (4.7)
Guys,

I've been testing out https://issues.apache.org/jira/browse/SOLR-5541 on 4.7RC4. I previously had an elevate.xml that elevated 3 documents for a specific query. My understanding is that I could, at runtime, exclude one of those. So I tried that like this:

http://localhost:8080/solr/ecommerce/search?q=canon&excludeIds=208464207

and now NONE of my documents are elevated. What I would have expected is that I'd have 2 elevated documents, but 208464207 would not be amongst them. Sadly, what happens is that now nothing is elevated.

Am I misunderstanding something, or should I open a JIRA? Looking at the source code, I can't immediately see what would be wrong.

Thanks, Lajos
Re: excludeIds in QueryElevationComponent (4.7)
Hit the send button too fast ...

What seems to be happening is that excludeIds or elevateIds ignore what's in elevate.xml. I would have expected (hoped) that they would layer on top of it, which makes a bit more sense, I think.

Thanks, Lajos

On 25/02/2014 22:58, Lajos wrote: [...]
Re: excludeIds in QueryElevationComponent (4.7)
: What seems to be happening is that excludeIds or elevateIds ignore
: what's in elevate.xml. I would have expected (hoped) that they would
: layer on top of it, which makes a bit more sense, I think.

That's not how it's implemented -- i believe Joel implemented it this way intentionally, because otherwise, if the elevate.xml said "elevate A,B and exclude X,Y", there would be no simple way to say "instead of what's in elevate.xml, i want to elevate X,Y and i don't want to exclude *anything*".

I made sure this was explicitly documented in the ref guide...

https://cwiki.apache.org/confluence/display/solr/The+Query+Elevation+Component#TheQueryElevationComponent-TheelevateIdsandexcludeIdsParameters

"If either one of these parameters is specified at request time, then the entire elevation configuration for the query is ignored."

-Hoss
http://www.lucidworks.com/
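[Editor's note: given these semantics, the behaviour Lajos was after -- keep two of the three elevated documents and drop 208464207 -- would be expressed by restating the ids to elevate rather than relying on elevate.xml. The two ids below are hypothetical stand-ins for the other documents in his elevate.xml:]

http://localhost:8080/solr/ecommerce/search?q=canon&elevateIds=111,222

No excludeIds is needed in that form, since 208464207 is simply not listed.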
Re: excludeIds in QueryElevationComponent (4.7)
Thanks Hoss, that makes sense. Anyway, I like the new paradigm better ... it allows for more intelligent elevation control.

Cheers, L

On 25/02/2014 23:26, Chris Hostetter wrote: [...]
Re: SolrCloud Startup
Jeff: Thanks. I have tried reload before, but it is not reliable (at least in 4.3.1): a few cores get initialized and a few don't (they show as just recovering or down), and hence I had to move away from it. Is it a known issue in 4.3.1?

Shawn, Otis, Erick: Yes, I have reviewed the page before and have given 1/4 of my memory to the JVM and the rest to the OS cache (15 GB heap, 45 GB for the rest; 60 GB machine in total). I have also reviewed the tlog files, and they are in the order of KB (4-10 or 30). I have SSDs, and the reads are hardly noticeable (in the order of 100 KB during that time frame). I have also disabled swap on all machines.

Regarding firstSearcher: it is currently set to externalFileLoader. What is the use of firstSearcher? I haven't played around with it.

Thanks, Nitin

On Mon, Feb 24, 2014 at 7:58 PM, Erick Erickson erickerick...@gmail.com wrote: What is your firstSearcher set to in solrconfig.xml? If you're doing something really crazy there, that might be an issue. But I think Otis' suggestion is a lot more probable. What are your autocommits configured to? Best, Erick

On Mon, Feb 24, 2014 at 7:41 PM, Shawn Heisey s...@elyograg.org wrote: Hi, I have a 4-node SolrCloud cluster with more than 50 collections of 4 shards each. Every time I want to make a schema change, I upload configs to ZooKeeper and then restart all nodes. However, the restart of each node is very slow and takes about 20-30 minutes per node. Is it recommended to set loadOnStartup=false and allow SolrCloud to lazy-load? Is there a way to make schema changes without restarting SolrCloud?

I'm on my phone, so getting a URL for you is hard. Search the wiki for SolrPerformanceProblems. There's a section there on slow startup. If that's not it, it's probably not enough RAM for the OS disk cache. That is also discussed on that wiki page. Thanks, Shawn
Re: SolrCloud Startup
Erick: My autocommit is set to trigger every 30 seconds with openSearcher=false. Autocommit for soft commits is disabled. On Tue, Feb 25, 2014 at 3:30 PM, KNitin nitin.t...@gmail.com wrote: [snip -- full message quoted above]
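For reference, a sketch of the commit settings Nitin describes, as they would appear in solrconfig.xml (values are illustrative):

  <autoCommit>
    <maxTime>30000</maxTime>                <!-- hard commit every 30 seconds -->
    <openSearcher>false</openSearcher>      <!-- do not open a new searcher on hard commit -->
  </autoCommit>
  <!-- soft commits disabled: omit autoSoftCommit, or set its maxTime to -1 -->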
Re: SolrCloud Startup
On 2/25/2014 4:30 PM, KNitin wrote: Jeff: Thanks. I have tried reload before, but it is not reliable (at least in 4.3.1). A few cores get initialized and a few don't (show as just recovering or down), and hence I had to move away from it. Is it a known issue in 4.3.1? With Solr 4.3.1, you are running into this bug with reloads under SolrCloud: https://issues.apache.org/jira/browse/SOLR-4805 The only way to recover from this bug is to restart Solr. The bug is fixed in 4.4.0 and later. [snip] Regarding firstSearcher, it is currently set to externalFileLoader. What is the use of firstSearcher? I haven't played around with it. I don't think it's a good idea to have extensive warming queries. I do exactly one query in firstSearcher and newSearcher: a query for all documents with zero rows, sorted on our most common sort field. This is designed purely to preload the sort data into the FieldCache. Thanks, Shawn
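A sketch of the minimal warming Shawn describes, assuming a solrconfig.xml listener; my_common_sort_field is a placeholder for your own most frequently used sort field, and the same <lst> can be repeated under a newSearcher listener:

  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst>
        <str name="q">*:*</str>
        <str name="rows">0</str>
        <str name="sort">my_common_sort_field desc</str>
      </lst>
    </arr>
  </listener>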
Re: CollapseQParserPlugin problem with ElevateComponent
Hi David, Just read through your comments on the jira. Feel free to create a jira for this. The way this currently works is that if the elevated document is not the selected group head, then both the elevated document and the group head are in the result set. What you are suggesting is that the elevated document becomes the group head. We can discuss the best way to handle this on the new ticket. Joel Bernstein Search Engineer at Heliosearch On Tue, Feb 25, 2014 at 1:29 PM, dboychuck dboych...@build.com wrote: https://issues.apache.org/jira/browse/SOLR-5773 I am having trouble with CollapseQParserPlugin showing duplicate groups when the search results contain a member of a grouped document but another member of that grouped document is defined in the elevate component. I have described the issue in more detail here: https://issues.apache.org/jira/browse/SOLR-5773 Any help is appreciated. Also, any hints as to how I can solve this problem myself would be great, as I'm having a bit of trouble understanding the code to implement a fix.
Re: Wildcard search not working if the query contains numbers along with special characters.
Hi Ahmet/Erick, I tried escaping as well. Still no luck. The title I am looking for is: ARABIAN NIGHTS #01 (1999/99). I figured out that if I pass the query as *1999/99* (i.e. an asterisk not only at the end but at the beginning as well), it works. The problem is the parentheses. I can change my field type and add <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" preserveOriginal="1"/> but this will show too many results in autocomplete. Is there any better way to handle this? Or should I pass an asterisk before and after the query? Thanks.
XML with duplicate element names
I'm trying to query XML documents stored in Riak 2.0, which has integrated Solr. My XML looks like this:

  <MainData>
    <Info>
      <Info name="Bob" city="Columbus" />
      <Info name="Joe" city="Cincinnati" />
    </Info>
  </MainData>

So a search in Riak might look something like this: q=MainData.Info.Info@name:Bob Now let's say I want to match all documents where name=Bob and city=Cincinnati for the same element. If I do something like the following: q=MainData.Info.Info@name:Bob AND MainData.Info.Info@city:Cincinnati I'll get a hit, even though that's not what I'm really looking for: I want Bob and Cincinnati matching in the same Info element. So, taking my example XML at the top of this post, how would I write the query to match a document where the MainData.Info.Info element has the attributes name=Joe and city=Cincinnati, i.e. the following line: <Info name="Joe" city="Cincinnati" /> I did try an fq that looked like this, figuring I could filter down to the element where name=Joe and then test whether city=Cincinnati, but it didn't work: q=MainData.Info.Info@city:Cincinnati&fq=MainData.Info.Info@name:Joe I'm obviously a noob here, so I apologize for my noobness in advance. Thanks!
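A guess at why the AND query cross-matches, assuming the indexer flattens element attributes into multivalued fields (a sketch, not the actual extractor output): the two Info elements would be indexed roughly as

  MainData.Info.Info@name: [Bob, Joe]
  MainData.Info.Info@city: [Columbus, Cincinnati]

so name:Bob AND city:Cincinnati matches because each value occurs somewhere in the document; once flattened this way, neither q nor fq can re-establish which name belonged to which city.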
Re: CollapseQParserPlugin problem with ElevateComponent
Hi Joel, Thank you for the reply. I created https://issues.apache.org/jira/browse/SOLR-5773 for this new feature. I was looking at the getBoostDocs() function, and if I understand it correctly, it iterates over the boosted Set<String> that is passed into the function, casting each element to a BytesRef and storing it in a HashSet. While I'm confused about what all the type conversion is actually doing, I can follow the logic somewhat. You then traverse the index and retrieve all of the terms for the unique id field of the schema. You then seek in the localBoosts HashSet for the current document, and if it is in the index you add it to boostDocs (to be returned from the function) as well as remove the document from the localBoosts HashSet. I don't think the document is actually removed from the result set in this function, however. I spent some hours today trying to decipher some of this code. I am very interested in understanding this code so that I can contribute back to this project, but I am finding it all a bit daunting. As always, your help is greatly appreciated, and thank you for the quick response. On Tue, Feb 25, 2014 at 5:38 PM, Joel Bernstein wrote: [snip -- full message quoted above] -- David Boychuck, Software Engineer Search, Team Lead, Build.com, Inc.
Re: Wildcard search not working if the query contains numbers along with special characters.
The admin/analysis page is your friend. Taking some time to get acquainted with that page will save you lots and lots and lots of time. In this case, you'd have seen that your input is actually tokenized as (1999/99), parentheses and all, as a _single_ token, so of course searching for 1999/99 wouldn't work. Searching for *1999/99* is generally a bad idea. It'll work, but it's a kludge. What you _do_ need to do is define your use cases. Let's assume that you _never_ want parentheses to be relevant. You could use PatternReplaceCharFilterFactory or PatternReplaceFilterFactory in both the index and query parts of your analysis chain to remove parens, or really any kinds of extraneous characters you decide are unimportant. But you need to decide what's important and enforce that. Best, Erick On Tue, Feb 25, 2014 at 7:28 PM, Kashish itzz.me.kash...@gmail.com wrote: [snip -- full message quoted above]
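A sketch of the charFilter approach Erick suggests; the fieldType name is made up, and the tokenizer is chosen so that 1999/99 survives as a single token (adapt both to your autocomplete use case):

  <fieldType name="text_noparens" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <!-- strip parentheses before tokenization, applied at both index and query time -->
      <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[()]" replacement=""/>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>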
Re: programmatically disable/enable solr queryResultCache...
Erick, Thanks for the response. Kindly have a look at my sample query: select?fl=city,$score&q=*:*&fq={!lucene q.op=OR df=city v=$cit}&cit=Chennai&sort=$score desc&score=norm($la,value,10)&la=8&b=1&c=2 Here score=norm($la,value,10), where norm is a custom function, so if I change la then $score should change. The first time it works fine, but if I change la alone and fire the query again, the results stay in the same order as the first query's results, which means sorting is not happening even though the score is different. But if I change cit=Chennai to cit=someCity, then I get results in the proper order, i.e. sorting works fine. "At any rate, queryResultCache is unlikely to impact much. All it is is a map containing the query and the first few document IDs (internal Lucene)." -- which means the query is the unique key and a list of document ids is the value mapped to that key. If I am not wrong, may I know how Solr builds these unique keys from queries? Does it build the key from only the common query parameters, or does it include all the parameters supplied by the user as part of the query (e.g. la=8&b=1&c=2)? Any clue? Thanks & Regards, Senthilnathan V On Tue, Feb 25, 2014 at 8:00 PM, Erick Erickson erickerick...@gmail.com wrote: This seems like an XY problem: you're asking for specifics on doing something without any indication of _why_ you think this would help, nor are you explaining what problem you're having in the first place. At any rate, queryResultCache is unlikely to impact much. All it is is a map containing the query and the first few document IDs (internal Lucene). See queryResultWindowSize in solrconfig.xml. It is quite lightweight: it does NOT store the entire result set, nor even the contents of the documents. Best, Erick On Tue, Feb 25, 2014 at 6:07 AM, Senthilnathan Vijayaraja senthilnat...@8kmiles.com wrote: Is there any way to programmatically disable/enable the Solr queryResultCache? I am using SolrJ. Thanks & Regards, Senthilnathan V
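On the original enable/disable question: as far as I know there is no per-request toggle for the queryResultCache; it is configured globally in solrconfig.xml, so disabling it means removing or commenting out the entry and reloading the core. A sketch of the stock configuration:

  <!-- remove or comment out this element to run without a queryResultCache -->
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>

On the key question: as far as I understand the Solr source (see QueryResultKey), the cache key is derived from the parsed main query, the filter queries, and the sort -- not from the raw parameter string -- so a parameter like la only affects caching insofar as it changes what the query or sort parses to.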