Re: Solr Update URI is not found

2013-10-28 Thread Raymond Wiker

On 28 Oct 2013, at 01:19 , Bayu Widyasanyata bwidyasany...@gmail.com wrote:

 request: http://localhost:8080/solr/update?wt=javabin&version=2

I think this url is incorrect: there should be a core name between solr and 
update.
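
For example, with a core named collection1 (just an illustration - use your
actual core name), the update URL would look like:

http://localhost:8080/solr/collection1/update?wt=javabin&version=2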

Re: Solr Update URI is not found

2013-10-28 Thread Bayu Widyasanyata
On Mon, Oct 28, 2013 at 1:26 PM, Raymond Wiker rwi...@gmail.com wrote:

  request: http://localhost:8080/solr/update?wt=javabin&version=2

 I think this url is incorrect: there should be a core name between solr
 and update.


I changed the Solr URL in the crawl script's options to:

./bin/crawl urls/seed.txt TestCrawl http://localhost:8080/solr/mycollection/2

And the result now is Bad Request.
I will look for other misconfigurations...

=

org.apache.solr.common.SolrException: Bad Request

Bad Request

request: http://localhost:8080/solr/mycollection/update?wt=javabin&version=2
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
at
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at
org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:155)
at
org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:118)
at
org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:44)
at
org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:467)
at
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:535)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
at
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
2013-10-28 13:30:02,804 ERROR indexer.IndexingJob - Indexer:
java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:123)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:185)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:195)



-- 
wassalam,
[bayu]


One of all shard stopping, all shards stop

2013-10-28 Thread hongkeun.yoo
Hi. 

I have a 3-shard SolrCloud, version 4.4.0, without replication.
http://lucene.472066.n3.nabble.com/file/n4098015/ex1.png 

For example, if one shard (leader) dies from an OOM, all shards stop.

Is it just the way that it is?
I want to find an option for this problem. I want to change it so that
if one shard dies, the remaining shards keep serving requests normally.

Thank you.






Optimal interval for soft commit

2013-10-28 Thread Mugoma Joseph O.
Hello,

We have a Solr index with about 1M docs.

Every day we add 5,000 to 8,000 docs.

We have defined a 15-second interval for soft commits. But for the impatient
user, 15 seconds looks like an eternity.

The wiki http://wiki.apache.org/solr/NearRealtimeSearch advises a 1s soft
commit interval but warns: "Be sure to pay special attention to cache and
autowarm settings as they can have a significant impact on NRT
performance".

I was looking at CommitWithin (http://wiki.apache.org/solr/CommitWithin,
http://stackoverflow.com/questions/17475456/solr-issues-with-soft-auto-commit-near-real-time)
as an alternative, but have no idea how it works or what the implications are.
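
For reference, commitWithin is specified per update request rather than in
solrconfig.xml. A minimal sketch, with a made-up document and a 10-second
window:

<add commitWithin="10000">
  <doc>
    <field name="id">doc-1</field>
  </doc>
</add>

or, equivalently, as a request parameter: /update?commitWithin=10000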

What would be best settings to achieve NRT search?

Thanks.

Mugoma.




Compound words

2013-10-28 Thread Parvesh Garg
Hi,

I'm an infant in Solr/Lucene family, just a couple of months old.

We are trying to find a way to combine words into a single compound word at
index and query time. E.g. if the document has "sea bird" in it, it should
be indexed as "seabird", and any query having "sea bird" in it should also look
for "seabird", not only in qf but also in pf, pf2, pf3 fields. Well, we are
using the edismax query parser.

Our problem is not at index time, we have achieved it by writing our own
token filter, but at query time. Our token filter takes a dictionary in the
form of prefix,suffix in the file and keeps emitting regular and compound
tokens as it encounters them.

We configured our own filter at query time but figured that at query time
individual clauses like field:sea , field:bird etc are created first and
then sent to the analyzer. First of all, can someone please confirm if this
part of my understanding is correct? So, we are forced to emit sea and bird
as individual tokens because we are not getting them in sequence at all.

Is it possible to achieve this by other means than pre-processing query
before sending it to solr? Can a CharFilter be used instead, are they
applied before creating query clauses?

I can keep providing more details as necessary. This mail has already
crossed TL;DR limits for many :)

Parvesh Garg
http://www.zettata.com
+91 963 222 5540


Re: Compound words

2013-10-28 Thread Parvesh Garg
One more thing, Is there a way to remove my accidentally sent phone number
in the signature from the previous mail? aarrrggghhh


Re: SolrCloud: optimizing a core triggers optimizations of all cores in that collection?

2013-10-28 Thread michael.boom
Thanks @Mark & @Erick

Should I create a JIRA issue for this ?



-
Thanks,
Michael


Re: Solr search in case the first keyword are not index

2013-10-28 Thread dtphat
I have solved it.
Thanks.



-
Phat T. Dong


Re: Optimal interval for soft commit

2013-10-28 Thread michael.boom
How do you add the documents to the index - one by one, or in batches of n?
When do you do your commits?
Because 8k docs per day is not a lot. Depending on the above, committing with
softCommit=true might also be a solution.
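
A minimal sketch of what that looks like over HTTP (host, port and core are
assumed):

http://localhost:8983/solr/collection1/update?commit=true&softCommit=true

That way the commit that makes the new docs visible is a cheap soft commit
rather than a hard one.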



-
Thanks,
Michael


Re: One of all shard stopping, all shards stop

2013-10-28 Thread michael.boom
When one of your shards dies, your index becomes incomplete. By default the
querying is distributed (on all shards - distrib=true) and if one of them
(shard X) is down, then you get an error stating that there are no servers
hosting shard X.

If the other shards are still up you can query them directly using
distrib=false but in the resultset you will only have documents from that
shard. So you would have to query every active shard individually and then
merge the results yourself.
If I'm wrong please correct me.
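
For example (host, port and core assumed):

http://localhost:8983/solr/collection1/select?q=*:*&distrib=false

returns only the documents hosted by the core you query directly.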



-
Thanks,
Michael


Re: Solr For

2013-10-28 Thread michael.boom
You're describing two different entities: Job and Employee.
Since they are clearly different in every way, you will need two different
cores with two different schemas.



-
Thanks,
Michael


Data import handler with multi tables

2013-10-28 Thread dtphat
Hi,
I want to import many tables from MySQL. Assume that I have two tables:
*** Table 1: tbl_tableA(id, nameA) with data (1, A1), (2, A2), (3, A3).
*** Table 2: tbl_tableB(id, nameB) with data (1, B1), (2, B2), (3, B3), (4,
B4), (5, B5).

I configure:

<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://xx"
              user="xxx" password="xxx" batchSize="1" />
  <document name="atexpats6">
    <entity name="tableA"
            query="select * from tbl_tableA">
      <field name="id" column="id"/>
      <field name="nameA" column="nameA" />
    </entity>
    <entity name="tableB"
            query="select * from tbl_tableB">
      <field name="id" column="id"/>
      <field name="nameA" column="nameA" />
    </entity>
  </document>
</dataConfig>

I define nameA, nameB in schema.xml, and id is configured by
<uniqueKey>id</uniqueKey>

When I import data by
http://localhost:8983/solr/dataimport?command=full-import

It's successful. But only the data of tbl_tableB got indexed.

I think it's because id is unique. During the import, tbl_tableA is imported
first and tbl_tableB after. tbl_tableB has ids that are the same as the ids in
tableA, so only tableB's data ends up indexed under each unique id.

Can anyone help me configure the data import handler so that it indexes all
data from two (or more) tables which have the same ids?

Thanks.



-
Phat T. Dong


error in suggester component in solr

2013-10-28 Thread anurag.sharma
I am working on Solr autocomplete functionality. I am using Solr 4.5.0 to
build my application, and I am following this link as a reference:
http://lucene.472066.n3.nabble.com/auto-completion-search-with-solr-using-NGrams-in-SOLR-td3998559i20.html

My suggest component is something like this:

<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="storeDir">suggest</str>
    <str name="field">autocomplete_text</str>
    <bool name="exactMatchFirst">true</bool>
    <float name="threshold">0.005</float>
    <str name="buildOnCommit">true</str>
    <str name="buildOnOptimize">true</str>
  </lst>
  <lst name="spellchecker">
    <str name="name">jarowinkler</str>
    <str name="field">lowerfilt</str>
    <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
    <str name="spellcheckIndexDir">spellchecker</str>
  </lst>
  <str name="queryAnalyzerFieldType">edgytext</str>
</searchComponent>


but I am getting the following error:

org.apache.solr.spelling.suggest.Suggester – Loading stored lookup
data failed
java.io.FileNotFoundException:
/home/anurag/Downloads/solr-4.4.0/example/solr/collection1/data/suggest/tst.dat
(No such file or directory)

It says that a file is missing, but the Solr wiki says the suggester
component supports this lookupImpl:

<str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>

Don't know what I am doing wrong. Any help will be deeply appreciated.
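
One guess: the tst.dat file only exists after the lookup data has been built
at least once, so the warning may simply mean no build has happened yet.
Assuming a request handler is wired to this component, a build can be forced
with something like:

http://localhost:8983/solr/suggest?q=r&spellcheck=true&spellcheck.dictionary=suggest&spellcheck.build=true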






Re: Newbie to Solr

2013-10-28 Thread Mamta Alshi
Hi Alex,

I have been able to run a few simple queries with my own schema.xml and
data file. My concern now is that I'm able to run queries like

http://localhost:8983/solr/select/?q=*:*

http://localhost:8983/solr/select/?q=*:*&facet=true&facet.field=Name

from the url

However, when I try to run them like this

*:*&facet=true&facet.field=Name

from the query string text box, it gives me an error like "undefined field *".

Any idea what is going wrong?

TIA




On Sun, Oct 27, 2013 at 1:28 PM, Mamta Alshi mamta.al...@gmail.com wrote:

 Hi Alex,

 That is what I am suspecting too. Trying to remove the other files from
 the exampledocs directory is not helping. Even after removing all files
 except details.xml, the results still show me data from the other files,
 but not my file.

 I am making changes to the same path which is displayed in Web Admin's
 dashboard.

 My last option will be to delete solr ,install it again and try.

 Thanks for your prompt response.


 On Sun, Oct 27, 2013 at 1:04 PM, Alexandre Rafalovitch arafa...@gmail.com
  wrote:

 Maybe your Solr instance is somehow using a different collection
 directory?

 In Web Admin's dashboard section, it shows the path to where it thinks the
 instance is. Does it match to what you expected?

 If it does, try deleting the core directory, restarting Solr and doing
 indexing again. Maybe you have some old stuff there accidentally.

 Regards,
Alex

 Personal website: http://www.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all at
 once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


 On Sun, Oct 27, 2013 at 3:45 PM, Mamta Alshi mamta.al...@gmail.com
 wrote:

  Hi,
 
  On trying to create a new schema.xml it shows the schema from the solr
  console. I have created a new data file called details.xml and placed
 it in
  the folder exampledocs. I have indexed just this one file from the
 command
  prompt.
 
  However, on my solr console in my query string when I query *:* it does
 not
  show me the contents from details.xml.
  It shows me contents of some other data file.
 
  Am I missing out on something?
 
  TIA .
 
 
  On Tue, Oct 1, 2013 at 3:16 PM, Kishan Parmar kishan@gmail.com
  wrote:
 
   yes, you have to create your own schema,
   but in the schema file you have to add your XML file's field names,
   likewise you can add your own field names to it ...

   or you can add your fields to the default schema file

   without a schema you cannot add your XML file to Solr
  
   my schema is like this
  
  
 
   --------------------------------------------------------------
   <?xml version="1.0" encoding="UTF-8" ?>
   <schema name="example" version="1.5">
     <fields>
       <field name="No" type="string" indexed="true" stored="true"
              required="true" multiValued="false" />
       <field name="Name" type="string" indexed="true" stored="true"
              required="true" multiValued="false" />
       <field name="Address" type="string" indexed="true" stored="true"
              required="true" multiValued="false" />
       <field name="Mobile" type="string" indexed="true" stored="true"
              required="true" multiValued="false" />
     </fields>
     <uniqueKey>No</uniqueKey>

     <types>
       <fieldType name="string" class="solr.StrField" sortMissingLast="true" />
       <fieldType name="int" class="solr.TrieIntField" precisionStep="0"
                  positionIncrementGap="0" />
     </types>
   </schema>
  
  
 
   --------------------------------------------------------------

   and my file is like this:

   --------------------------------------------------------------
   <add>
     <doc>
       <field name="No">100120107088</field>
       <field name="Name">kishan</field>
       <field name="Address">ghatlodia</field>
       <field name="Mobile">9510077394</field>
     </doc>
   </add>
  
   Regards,
  
   Kishan Parmar
   Software Developer
   +91 95 100 77394
   Jay Shree Krishnaa !!
  
  
  
   On Tue, Oct 1, 2013 at 1:11 AM, mamta mamta.al...@gmail.com wrote:
  
Hi,
   
 I want to know: if I have to fire some query through the Solr admin,
 do I need to create a new schema.xml? Where do I place it in case I have
 to create a new one?

 In case I can edit the original schema.xml, can there be two fields
 named id in my schema.xml?

 I desperately need help in running queries on the Solr admin, which is
 configured on a Tomcat server.

 What preparation will I need to do? A schema.xml? Any docs?
   
Any help will be highly appreciated.
   
Thanks,
Mamta
   
   
   

Re: Newbie to Solr

2013-10-28 Thread michael.boom
Put *:* in the q field.
Then check the facet checkbox (look lower, close to the Execute button) and
insert Name in the facet.field box.
This should do the trick.
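
For reference, what that form submits is equivalent to the URL from your
first message:

http://localhost:8983/solr/select/?q=*:*&facet=true&facet.field=Name

The q box should contain only the query itself (*:*), not the other
parameters.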



-
Thanks,
Michael


Re: AW: AW: auto completion search with solr using NGrams in SOLR

2013-10-28 Thread anurag.sharma
Hi ... I am trying to build autocomplete functionality using your post. But I
am getting the following error

2577 [coreLoadExecutor-3-thread-1] WARN 
org.apache.solr.spelling.suggest.Suggester  – Loading stored lookup data
failed
java.io.FileNotFoundException:
/home/anurag/Downloads/solr-4.4.0/example/solr/collection1/data/suggest/tst.dat
(No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:137)
at org.apache.solr.spelling.suggest.Suggester.init(Suggester.java:116)
at
org.apache.solr.handler.component.SpellCheckComponent.inform(SpellCheckComponent.java:623)
at
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:601)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:830)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:629)

I am using Solr 4.4. Does the suggester component still work in this version?





Re: Background merge errors with Solr 4.4.0 on Optimize call

2013-10-28 Thread Erick Erickson
For Tomcat, the Solr log output often goes into catalina.out
by default, so the output might be there. You can
configure Solr to send the logs most anywhere you
please, but without some specific setup
on your part the log output just goes to the default
for the servlet container.

I took a quick glance at the code but since the merges
are happening in the background, there's not much
context for where that error is thrown.

How much memory is there for the JVM? I'm grasping
at straws a bit...

Erick


On Sun, Oct 27, 2013 at 9:54 PM, Matthew Shapiro m...@mshapiro.net wrote:

 I am working on implementing Solr as the search backend for our web
 system.  So far things have been going well, but today I made some schema
 changes and now things have broken.

 I updated the schema.xml file and reloaded the core (via the admin
 interface).  No errors were reported in the logs.

 I then pushed 100 records to be indexed.  A call to Commit afterwards
 seemed fine, however my next call for Optimize caused the following errors:

 java.io.IOException: background merge hit exception:
 _2n(4.4):C4263/154 _30(4.4):C134 _32(4.4):C10 _31(4.4):C10 into _37
 [maxNumSegments=1]

 null:java.io.IOException: background merge hit exception:
 _2n(4.4):C4263/154 _30(4.4):C134 _32(4.4):C10 _31(4.4):C10 into _37
 [maxNumSegments=1]


 Unfortunately, googling for "background merge hit exception" came up
 with two things: a corrupt index or not enough free space.  The host
 machine that's hosting solr has 227 out of 229GB free (according to df
 -h), so that's not it.


 I then ran CheckIndex on the index, and got the following results:
 http://apaste.info/gmGU


 As someone who is new to solr and lucene, as far as I can tell this
 means my index is fine. So I am coming up at a loss. I'm somewhat sure
 that I could probably delete my data directory and rebuild it but I am
 more interested in finding out why is it having issues, what is the
 best way to fix it, and what is the best way to prevent it from
 happening when this goes into production.


 Does anyone have any advice that may help?


 As an aside, I do not have a stacktrace for you because the solr admin
 page isn't giving me one.  I tried looking in my logs file in my solr
 directory, but it does not contain any logs.  I opened up my
 ~/tomcat/lib/log4j.properties file and saw http://apaste.info/0rTL,
 which didn't really help me find log files.  Doing a 'find . | grep
 solr.log' didn't really help either.  Any help for finding log files
 (which may help find the actual cause of this) would also be
 appreciated.



Re: Newbie to Solr

2013-10-28 Thread Mamta Alshi
Hi Michael,

Thanks for the prompt response. Have a look at my attached admin user
interfaces.

I do not quite see the options you mention.


On Mon, Oct 28, 2013 at 2:18 PM, michael.boom my_sky...@yahoo.com wrote:

 Put *:* in the q field
 Then check the facet check box (look lower close to the Execute button) and
 in the facet.field insert Name.
 This should do the trick.



 -
 Thanks,
 Michael



Re: Newbie to Solr

2013-10-28 Thread Mamta Alshi
How do I get the Solr admin web user interface?


On Mon, Oct 28, 2013 at 2:32 PM, Mamta Alshi mamta.al...@gmail.com wrote:

 Hi Michael,

 Thanks for the prompt response. Have a look at my attached admin user
 interfaces.

 I do not quite see the options you mention.


 On Mon, Oct 28, 2013 at 2:18 PM, michael.boom my_sky...@yahoo.com wrote:

 Put *:* in the q field
 Then check the facet check box (look lower close to the Execute button)
 and
 in the facet.field insert Name.
 This should do the trick.



 -
 Thanks,
 Michael





Re: When is/should qf different from pf?

2013-10-28 Thread Erick Erickson
The facetious answer is: when phrases aren't important in the fields.
If you're doing a simple boolean match, adding phrase fields will add
expense to no good purpose, etc. Phrases on numeric
fields seem wrong.
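
A sketch with invented field names:

qf=title^2 description id_num
pf=title^4 description

- the numeric id_num stays out of pf, since phrase matching on it would only
add cost.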

FWIW,
Erick


On Mon, Oct 28, 2013 at 1:03 AM, Amit Nithian anith...@gmail.com wrote:

 Hi all,

 I have been using Solr for years but never really stopped to wonder:

 When using the dismax/edismax handler, when do you have the qf different
 from the pf?

 I have always set them to be the same (maybe different weights) but I was
 wondering if there is a situation where you would have a field in the qf
 not in the pf or vice versa.

 My understanding from the docs is that qf is a term-wise hard filter while
 pf is a phrase-wise boost of documents that made it past the qf filter.

 Thanks!
 Amit



Re: Solr Update URI is not found

2013-10-28 Thread Erick Erickson
This seems like a better question for the Nutch list. I see hadoop
in there, so unless you've specifically configured solr to use
the HDFS directory writer factory, this has to be coming from
someplace else. And there are map/reduce tasks in here.

BTW, it would be more helpful if you posted the URL that you
successfully queried Solr with... What is the /2 on the end for?
Do you use that when you query?

Best,
Erick


On Mon, Oct 28, 2013 at 2:37 AM, Bayu Widyasanyata
bwidyasany...@gmail.comwrote:

 On Mon, Oct 28, 2013 at 1:26 PM, Raymond Wiker rwi...@gmail.com wrote:

   request: http://localhost:8080/solr/update?wt=javabin&version=2
 
  I think this url is incorrect: there should be a core name between solr
  and update.
 

 I changed the Solr URL in the crawl script's options to:

 ./bin/crawl urls/seed.txt TestCrawl
 http://localhost:8080/solr/mycollection/2

 And the result now is Bad Request.
 I will look for other misconfigurations...

 =

 org.apache.solr.common.SolrException: Bad Request

 Bad Request

 request:
 http://localhost:8080/solr/mycollection/update?wt=javabin&version=2
 at

 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
 at

 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
 at

 org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
 at

 org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:155)
 at
 org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:118)
 at

 org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:44)
 at

 org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:467)
 at
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:535)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
 at
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
 2013-10-28 13:30:02,804 ERROR indexer.IndexingJob - Indexer:
 java.io.IOException: Job failed!
 at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
 at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:123)
 at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:185)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
 at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:195)



 --
 wassalam,
 [bayu]



Re: Newbie to Solr

2013-10-28 Thread michael.boom
I don't see the mentioned attachment.
Try using http://snag.gy/ to provide it.

As for where you find it, the default is
http://localhost:8983/solr/collection1/query



-
Thanks,
Michael


Re: Optimal interval for soft commit

2013-10-28 Thread Mugoma Joseph O.
Hello,

 How do you add the documents to the index - one by one, batches of n ?

Documents are added one by one using solrj

 When do you do your commits ?

We have the following settings in solrconfig.xml:


<autoCommit>
  <maxTime>180</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>15000</maxTime>
</autoSoftCommit>



Thanks.

Mugoma.


On Mon, October 28, 2013 12:22 pm, michael.boom wrote:
 How do you add the documents to the index - one by one, batches of n ?
 When
 do you do your commits ?
  Because 8k docs per day is not a lot. Depending on the above, committing
 with
 softCommit=true might also be a solution.



 -
 Thanks,
 Michael





Apache-Solr with Tomcat: displaying the format of search result

2013-10-28 Thread pyramesh
Hi All,

Recently I have integrated Apache Solr with a Tomcat server. Everything is
working fine. I am displaying the search results using a Velocity template.

But here is my problem: the search results are not displayed in the same
format as the input data.

For example, the input data (the whole data is contained in a single field):

issue: description about issue.
Solution: Solution given by user goes here.

but after indexing the data, it is displayed in the following format
in the search result: issue: description about issue. Solution:
Solution given by user goes here.

But this is not what I want. I want to display the data in the same format
as the input.

Can anyone please help with this?


Thanks in Advance ...



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Apache-Solr-with-Tomcat-displaying-the-format-of-search-result-tp4098040.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Compound words

2013-10-28 Thread Erick Erickson
Why did you reject using synonyms? You can have multi-word
synonyms just fine at index time, and at query time, since the
multiple words are already substituted in the index you don't
need to do the same substitution, just query the raw strings.

I freely acknowledge you may have very good reasons for doing
this yourself, I'm just making sure you know what's already
there.

See:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

Look particularly at the explanations for sea biscuit in that section.
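
A minimal sketch of that setup (the field type name and file name are
placeholders). In synonyms.txt:

seabiscuit, sea biscuit

and in schema.xml, with the synonym filter applied only at index time:

<fieldType name="text_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- expand="true" indexes all variants, so the query side needs no synonyms -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>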

Best,
Erick



On Mon, Oct 28, 2013 at 3:47 AM, Parvesh Garg parv...@zettata.com wrote:

 One more thing, Is there a way to remove my accidentally sent phone number
 in the signature from the previous mail? aarrrggghhh



Re: Optimal interval for soft commit

2013-10-28 Thread Erick Erickson
To reply to your original question, when you soft commit
the top-level caches are thrown away, i.e. the filterCache,
queryResultCache, documentCache - all the ones in solrconfig.xml.

And if you have a high autowarm count on them, you wind
up doing a lot of work for no gain. Say your soft commit
interval is 1 second. Only queries that come in during that
one second even _potentially_ use the caches.

Here's a long blog with lots of background:
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Try this:
1> set your soft commit interval to 1
2> set your cache sizes in solrconfig to 5
3> set your autowarm counts in <2> to 0 (see the sketch below).
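
In solrconfig.xml that looks something like this (filterCache shown; the same
idea applies to the other caches):

<filterCache class="solr.FastLRUCache" size="5" initialSize="5" autowarmCount="0"/>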

try it. If you see unacceptable degradation in query performance,
then this is too aggressive and you need some caching.
If not, don't bother caching.

As always, it's a tradeoff between how fast docs are searchable
and how much you can improve things with caching.

Best,
Erick


On Mon, Oct 28, 2013 at 6:42 AM, Mugoma Joseph O. mug...@yengas.com wrote:

 Hello,

  How do you add the documents to the index - one by one, batches of n ?

 Documents are added one by one using solrj

  When do you do your commits ?

 We have the following settings in solrconfig.xml:


  <autoCommit>
    <maxTime>180</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <autoSoftCommit>
    <maxTime>15000</maxTime>
  </autoSoftCommit>



 Thanks.

 Mugoma.


 On Mon, October 28, 2013 12:22 pm, michael.boom wrote:
  How do you add the documents to the index - one by one, batches of n ?
  When
  do you do your commits ?
   Because 8k docs per day is not a lot. Depending on the above, committing
  with
  softCommit=true might also be a solution.
 
 
 
  -
  Thanks,
  Michael
 





Re: Solr Update URI is not found

2013-10-28 Thread Bayu Widyasanyata
Hi Erick and All,

The problem is solved by copying schema-solr4.xml into my collection's Solr
conf (renamed to schema.xml).
I didn't use Hadoop there; apologies - I posted on this Solr list since the
problem first appeared at the Solr indexer step.

Regarding the /2 option: I think that's an artifact of the e-mail quoting :)
In my first posting, that was the crawl script syntax; in my case:

# ./bin/crawl urls/seed.txt TestCrawl http://localhost:8080/solr/ 2

2 = the number of rounds.

See here:
http://wiki.apache.org/nutch/NutchTutorial#A3.3._Using_the_crawl_script

Again, thanks everyone!


On Mon, Oct 28, 2013 at 5:39 PM, Erick Erickson erickerick...@gmail.comwrote:

 This seems like a better question for the Nutch list. I see hadoop
 in there, so unless you've specifically configured solr to use
 the HDFS directory writer factory, this has to be coming from
 someplace else. And there are map/reduce tasks in here.

 BTW, it would be more helpful if you posted the URL that you
 successfully queried Solr with... What is the /2 on the end for?
 Do you use that when you query?

 Best,
 Erick


 On Mon, Oct 28, 2013 at 2:37 AM, Bayu Widyasanyata
 bwidyasany...@gmail.comwrote:

  On Mon, Oct 28, 2013 at 1:26 PM, Raymond Wiker rwi...@gmail.com wrote:
 
    request: http://localhost:8080/solr/update?wt=javabin&version=2
  
   I think this url is incorrect: there should be a core name between
 solr
   and update.
  
 
  I changed the Solr URL in the crawl script's options to:
 
  ./bin/crawl urls/seed.txt TestCrawl
  http://localhost:8080/solr/mycollection/2
 
  And the result now is Bad Request.
  I will look for other misconfigurations...
 
  =
 
  org.apache.solr.common.SolrException: Bad Request
 
  Bad Request
 
  request:
  http://localhost:8080/solr/mycollection/update?wt=javabin&version=2
  at
 
 
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
  at
 
 
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
  at
 
 
 org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
  at
 
 
 org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:155)
  at
  org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:118)
  at
 
 
 org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:44)
  at
 
 
 org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:467)
  at
  org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:535)
  at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
  at
  org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
  2013-10-28 13:30:02,804 ERROR indexer.IndexingJob - Indexer:
  java.io.IOException: Job failed!
  at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
  at
 org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:123)
  at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:185)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
  at
 org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:195)
 
 
 
  --
  wassalam,
  [bayu]
 




-- 
wassalam,
[bayu]


Solr 4.5.1 replication Bug? Illegal to have multiple roots (start tag in epilog?).

2013-10-28 Thread Sai Gadde
We have an error similar to the one in this thread:

http://www.mail-archive.com/solr-user@lucene.apache.org/msg90748.html

We tried the Tomcat settings from this post, using the exact settings
specified there. We merge 500 documents at a time. I am creating a new thread
because Michael is using Jetty whereas we use Tomcat.


The formdataUploadLimitInKB and multipartUploadLimitInKB limits are set to a
very high value (2GB), as suggested in the following thread:
https://issues.apache.org/jira/browse/SOLR-5331
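
For reference, those limits live on the requestParsers element in
solrconfig.xml; a sketch with the 2GB values mentioned above:

<requestDispatcher handleSelect="false">
  <requestParsers enableRemoteStreaming="true"
                  multipartUploadLimitInKB="2048000"
                  formdataUploadLimitInKB="2048000"/>
</requestDispatcher>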


We use out-of-the-box Solr 4.5.1, no customization done. If we merge
documents via SolrJ to a single server, it works perfectly fine.


But as soon as we add another node to the cloud, we get the following
errors while merging documents.



This is the error we are getting on the server where the merging is
happening (10.10.10.116 - the IP is irrelevant, just for clarity).
10.10.10.119 is the new node here. This server gets a RemoteSolrException:


shard update error StdNode:
http://10.10.10.119:8980/solr/mycore/:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
Illegal to have multiple roots (start tag in epilog?).
 at [row,col {unknown-source}]: [1,12468]
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:425)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
at 
org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:401)
at 
org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)





On the other server, 10.10.10.119, we get the following error:


org.apache.solr.common.SolrException: Illegal to have multiple roots
(start tag in epilog?).
 at [row,col {unknown-source}]: [1,12468]
at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at 
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:936)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
at 
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
at 
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
at 
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal to have
multiple roots (start tag in epilog?).
 at [row,col {unknown-source}]: [1,12369]
at 
com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630)
at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)
at 
com.ctc.wstx.sr.BasicStreamReader.handleExtraRoot(BasicStreamReader.java:2155)
at 
com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2070)
at 
com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2647)

Field Value depending on another field value

2013-10-28 Thread bengates
Hello,

I'm pretty new to Solr, and I have a question about best practice.

I want to handle a Solr collection with products that are available in
different shops.
For several reasons, the price of a product may be the same or vary,
depending on the shop's location.

What I don't know how to handle correctly is the ability to have a price
that is a multivalued notion, whose value depends on another field.

Imagine the following product in the collection:
{
  "id": 123456,
  "name": "The Wonderful product",
  "SellableInShop": [1, 3],
  "Price": 0,
  "PriceInShop1": 34.99,
  "PriceInShop2": 0,
  "PriceInShop3": 38.99
}

Behaviour I want when the user searches for "wonderful" after selecting
shop #3:
/query?q=wonderful AND SellableInShop:3

{
  "id": 123456,
  "name": "The Wonderful product",
  "SellableInShop": [1, 3],
  "Price": 38.99
}

My question is: how do I fill, at query time, the content of a field Price,
depending on two other fields: SellableInShop and PriceInShop3 (PriceInShop2
if SellableInShop == 2, PriceInShop1 if SellableInShop == 1, etc.)?

Thanks a lot,
Ben





Re: Data import handler with multi tables

2013-10-28 Thread Stefan Matheis
 I think it's because id is unique. During the import, tbl_tableA is imported
 first and tbl_tableB after. tbl_tableB has ids that are the same as the ids
 in tableA, so only tableB's data ends up indexed under each unique id.
 
 

That's exactly what happens here :) If the second table had fewer records 
than the first one, you'd still see records from the first table.

 Can anyone help me configure the data import handler so that it indexes all
 data from two (or more) tables which have the same ids?
 
 

that requires the use of a key which is known as a compound key 
(http://en.wikipedia.org/wiki/Compound_key), f.e. if data comes from Table A .. 
make it A1 instead of (only) 1, A2, B1, B2 .. and so on. you can still index 
the raw ids in another field .. but for the unique key .. you need something 
like that to get it working.
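
A sketch of what that can look like in the DIH config, using the
TemplateTransformer to prefix the ids (the A- prefix is illustrative):

<entity name="tableA" transformer="TemplateTransformer"
        query="select * from tbl_tableA">
  <!-- make the unique key A-1, A-2, ... instead of the raw 1, 2, ... -->
  <field column="id" template="A-${tableA.id}"/>
  <field name="nameA" column="nameA"/>
</entity>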


HTH
Stefan



On Monday, October 28, 2013 at 10:45 AM, dtphat wrote:

 Hi,
 I want to import many tables from MySQL. Assume that I have two tables:
 *** Table 1: tbl_tableA(id, nameA) with data (1, A1), (2, A2), (3, A3).
 *** Table 2: tbl_tableB(id, nameB) with data (1, B1), (2, B2), (3, B3), (4,
 B4), (5, B5).
 
 I configure:

 <dataConfig>
   <dataSource type="JdbcDataSource"
               driver="com.mysql.jdbc.Driver"
               url="jdbc:mysql://xx"
               user="xxx" password="xxx" batchSize="1" />
   <document name="atexpats6">
     <entity name="tableA"
             query="select * from tbl_tableA">
       <field name="id" column="id"/>
       <field name="nameA" column="nameA" />
     </entity>
     <entity name="tableB"
             query="select * from tbl_tableB">
       <field name="id" column="id"/>
       <field name="nameA" column="nameA" />
     </entity>
   </document>
 </dataConfig>
 
 I define nameA, nameB in schema.xml, and id is configured by
 <uniqueKey>id</uniqueKey>
 
 When I import data by
 http://localhost:8983/solr/dataimport?command=full-import
 
 It's successful. But only the data of tbl_tableB got indexed.
 
 I think it's because id is unique. During the import, tbl_tableA is imported
 first and tbl_tableB after. tbl_tableB has ids that are the same as the ids
 in tableA, so only tableB's data ends up indexed under each unique id.
 
 Can anyone help me configure the data import handler so that it indexes all
 data from two (or more) tables which have the same ids?
 
 Thanks.
 
 
 
 -
 Phat T. Dong
 
 




Re: Compound words

2013-10-28 Thread Parvesh Garg
Hi Erick,

Thanks for the suggestion. Like I said, I'm an infant.

We tried synonyms both ways, "sea biscuit => seabiscuit" and "seabiscuit =>
sea biscuit", and didn't understand exactly how it worked. But I just checked
the analysis tool, and it seems to work perfectly fine at index time. Now
I can happily discard my own filter and 4 days of work. I'm happy I got to
know a few ways on how/when not to write a Solr filter :)

I tried the string "sea biscuit sea bird" with expand=false, and the tokens
I got were seabiscuit, sea, bird at positions 1, 2 and 3 respectively. But at
query time, when I enter the same term "sea biscuit sea bird", using
edismax and qf, pf2, and pf3, the parsedQuery looks like this:

+((text:sea) (text:biscuit) (text:sea) (text:bird)) ((text:"biscuit sea")
(text:"sea bird")) ((text:"seabiscuit sea") (text:"biscuit sea bird"))

What I wanted instead was this:

+((text:seabiscuit) (text:sea) (text:bird)) ((text:"seabiscuit sea")
(text:"sea bird")) (text:"seabiscuit sea bird")

Looks like there isn't any other way than to pre-process the query myself and
create the compound word. What do you mean by "just query the raw strings"?
Am I still missing something?

Parvesh Garg
http://www.zettata.com
(This time I did remove my phone number :) )

On Mon, Oct 28, 2013 at 4:14 PM, Erick Erickson erickerick...@gmail.comwrote:

 Why did you reject using synonyms? You can have multi-word
 synonyms just fine at index time, and at query time, since the
 multiple words are already substituted in the index you don't
 need to do the same substitution, just query the raw strings.

 I freely acknowledge you may have very good reasons for doing
 this yourself, I'm just making sure you know what's already
 there.

 See:

 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

 Look particularly at the explanations for sea biscuit in that section.

 Best,
 Erick



 On Mon, Oct 28, 2013 at 3:47 AM, Parvesh Garg parv...@zettata.com wrote:

  One more thing, Is there a way to remove my accidentally sent phone
 number
  in the signature from the previous mail? aarrrggghhh
 



Re: One of all shard stopping, all shards stop

2013-10-28 Thread hongkeun.yoo
Thanks for your reply. If one of the servers has stopped with an error, this
option (distrib=false) works well. A similar option is
shards.tolerant=true, but I don't want to use that option, because the
dead server doesn't show an error message; it just returns no data.

I want the dead server to show an error message, while the other, normal
servers keep working normally.





Re: Data import handler with multi tables

2013-10-28 Thread dtphat
Hi,
Is there no other way to import all the data in this case than using a
compound key?
Thanks.



-
Phat T. Dong


Re: Compound words

2013-10-28 Thread Erick Erickson
Consider setting expand=true at index time. That
puts all the tokens in your index, and then you
may not need to have any synonym
processing at query time since all the variants will
already be in the index.

As it is, you've replaced the words in the original with
synonyms, essentially collapsed them down to a single
word and then you have to do something at query time
to get matches. If all the variants are in the index, you
shouldn't have to. That's what I meant by raw.

Best,
Erick


On Mon, Oct 28, 2013 at 8:02 AM, Parvesh Garg parv...@zettata.com wrote:

 Hi Erick,

 Thanks for the suggestion. Like I said, I'm an infant.

  We tried synonyms both ways, "sea biscuit => seabiscuit" and "seabiscuit =>
  sea biscuit", and didn't understand exactly how it worked. But I just checked
  the analysis tool, and it seems to work perfectly fine at index time. Now
  I can happily discard my own filter and 4 days of work. I'm happy I got to
  know a few ways on how/when not to write a Solr filter :)

  I tried the string "sea biscuit sea bird" with expand=false, and the tokens
  I got were seabiscuit, sea, bird at positions 1, 2 and 3 respectively. But at
  query time, when I enter the same term "sea biscuit sea bird", using
  edismax and qf, pf2, and pf3, the parsedQuery looks like this:

  +((text:sea) (text:biscuit) (text:sea) (text:bird)) ((text:"biscuit sea")
  (text:"sea bird")) ((text:"seabiscuit sea") (text:"biscuit sea bird"))

  What I wanted instead was this:

  +((text:seabiscuit) (text:sea) (text:bird)) ((text:"seabiscuit sea")
  (text:"sea bird")) (text:"seabiscuit sea bird")

  Looks like there isn't any other way than to pre-process the query myself
  and create the compound word. What do you mean by "just query the raw
  strings"? Am I still missing something?

 Parvesh Garg
 http://www.zettata.com
 (This time I did remove my phone number :) )

 On Mon, Oct 28, 2013 at 4:14 PM, Erick Erickson erickerick...@gmail.com
 wrote:

  Why did you reject using synonyms? You can have multi-word
  synonyms just fine at index time, and at query time, since the
  multiple words are already substituted in the index you don't
  need to do the same substitution, just query the raw strings.
 
  I freely acknowledge you may have very good reasons for doing
  this yourself, I'm just making sure you know what's already
  there.
 
  See:
 
 
 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
 
  Look particularly at the explanations for sea biscuit in that section.
 
  Best,
  Erick
 
 
 
  On Mon, Oct 28, 2013 at 3:47 AM, Parvesh Garg parv...@zettata.com
 wrote:
 
   One more thing, Is there a way to remove my accidentally sent phone
  number
   in the signature from the previous mail? aarrrggghhh
  
 



Re: One of all shard stopping, all shards stop

2013-10-28 Thread Erick Erickson
I think if you set shards.tolerant=true you get information in the
return packet if a shard is completely down.

The other thing you can do is query the ZooKeeper cluster state
directly.

But I have to ask why you're not using a replica or two per shard.
That should provide automatic fail-over etc and make the necessity
of dealing with this case _much_ less frequent. Personally I'd put
more effort into making an always-up cluster than dealing with
when a single node goes down.

FWIW,
Erick


On Mon, Oct 28, 2013 at 8:10 AM, hongkeun.yoo hunter...@naver.com wrote:

 Thanks for your reply. If one of the servers has stopped with an error, this
 option (distrib=false) works well. A similar option is
 shards.tolerant=true, but I don't want to use that option, because the
 dead server doesn't show an error message; it just returns no data.

 I want the dead server to show an error message, while the other, normal
 servers keep working normally.






return value from SolrJ client to php

2013-10-28 Thread Amit Aggarwal
Hello All,

I have a requirement where I have to connect to Solr using the SolrJ client,
and the documents returned by Solr to the SolrJ client have to be passed on
to PHP.

I know it's simple to get documents from Solr into SolrJ,
but how do I return documents from SolrJ to PHP?


Thanks
Amit Aggarwal


Re: Field Value depending on another field value

2013-10-28 Thread Anshum Gupta
Hi Ben,

You can actually look at indexing single-valued documents, i.e. a different
one for every store, and then grouping by the product id.
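
A sketch of that shape (field names invented): index one document per
product/shop pair, e.g.

{"id": "123456-3", "productId": "123456", "name": "The Wonderful product",
 "shopId": 3, "price": 38.99}

and query with result grouping:

/query?q=wonderful&fq=shopId:3&group=true&group.field=productId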
Have a look at this presentation by Adrian Trenaman at the Lucene
Revolution earlier this year:

Presentation:
http://www.slideshare.net/trenaman/personalized-search-on-the-largest-flash-sale-site-in-america
Video: http://www.youtube.com/watch?v=kJa-3PEc90g

Hope that helps you.



On Mon, Oct 28, 2013 at 5:06 PM, bengates benga...@aliceadsl.fr wrote:

 Hello,

 I'm pretty new to Solr, and I have a question about best practice.

 I want to handle a Solr collection with products that are available in
 different shops.
 For several reasons, the price of a product may be the same or vary,
 depending on the shop's location.

 What I don't know how to handle correctly is the ability to have a price
 that is a multivalued notion, which value depends on another field.

 Imagine the following product in the collection:
 {
   "id": 123456,
   "name": "The Wonderful product",
   "SellableInShop": [1, 3],
   "Price": 0,
   "PriceInShop1": 34.99,
   "PriceInShop2": 0,
   "PriceInShop3": 38.99
 }

 Behaviour I want when the user searches for "wonderful" after selecting
 shop #3:
 /query?q=wonderful AND SellableInShop:3

 {
   "id": 123456,
   "name": "The Wonderful product",
   "SellableInShop": [1, 3],
   "Price": 38.99
 }

 My question is: how do I fill, at query time, the content of a field
 Price, depending on two other fields: SellableInShop and PriceInShop3
 (PriceInShop2 if SellableInShop == 2, PriceInShop1 if SellableInShop == 1,
 etc.)?

 Thanks a lot,
 Ben







-- 

Anshum Gupta
http://www.anshumgupta.net


Re: return value from SolrJ client to php

2013-10-28 Thread Anshum Gupta
Hi Amit,

I haven't personally tried it, but have a look at the options listed here:
http://wiki.apache.org/solr/IntegratingSolr

Also, just check if the library you try is known to work with the version
of Solr you'd want to use.

Otherwise, how about just using a serialization library for apps in the 2
languages to talk to each other?




On Mon, Oct 28, 2013 at 7:03 PM, Amit Aggarwal amit.aggarwa...@gmail.comwrote:

 Hello All,

 I have a requirement where I have to conect to Solr using SolrJ client and
 documents return by solr to SolrJ client have to returned to PHP.

 I know its simple to get document from Solr to SolrJ
 But how do I return documents from SolrJ to PHP ?


 Thanks
 Amit Aggarwal




-- 

Anshum Gupta
http://www.anshumgupta.net


Re: Proposal for new feature, cold replicas, brainstorming

2013-10-28 Thread Toke Eskildsen
On Sat, 2013-10-26 at 02:14 +0200, Chris Hostetter wrote:
 I suspect that the most straightforward way to achieve what you 
 folks seem to be describing would be to add a hook into the request 
 distribution processing so that you could have a custom plugin used when 
 solr does Replica r = pickReplica(shardName) and your implementation of 
 pickReplica() would look something like (all pseudo code)...
 
   List<Replica> allInShard = clusterState.getAllLiveReplicas(shardName);
   List<Replica> candidates = new List();
   for (Replica r : allInShard) {
     if (! r.hasRole(shardIsLastResort)) {
       candidates.add(r);
     }
   }
   return candidates.isEmpty() ? allInShard : candidates;

I am not very familiar with the distribution code in Solr. I located
CloudSolrServer.request(SolrRequest request), which seems to be the place
you are talking about? It extracts replica URLs and generates a
LBHttpSolrServer.Req with that list, which it immediately uses with the
LBHttpSolrServer.

As I understand it, feeding LBHttpSolrServer.Req with only shards that
are primary would mean an exception if those shards do not answer. In
order to handle the first search against a failed primary shard
gracefully, wouldn't we need to extend LBHttpSolrServer.Req to have
two lists, primary and lastResort, instead of one? This would also
require a rewrite of the try-retry logic in LBHttpSolrServer.

 ...if i remember correctly, there is already a hook (or there is an issue 
 about adding a hook) to let you do plugin logic like this -- [...]

I did not see one in the code and could not locate a JIRA issue. Not
that it means that it isn't there.

Thank you for your time,
Toke Eskildsen



Re: Compound words

2013-10-28 Thread Roman Chyla
Hi Parvesh,
I think you should check the following jira
https://issues.apache.org/jira/browse/SOLR-5379. You will find links there
to other possible solutions/problems :-)
Roman
On 28 Oct 2013 09:06, Erick Erickson erickerick...@gmail.com wrote:

 Consider setting expand=true at index time. That
 puts all the tokens in your index, and then you
 may not need to have any synonym
 processing at query time since all the variants will
 already be in the index.

 As it is, you've replaced the words in the original with
 synonyms, essentially collapsed them down to a single
 word and then you have to do something at query time
 to get matches. If all the variants are in the index, you
 shouldn't have to. That's what I meant by raw.

 Best,
 Erick


 On Mon, Oct 28, 2013 at 8:02 AM, Parvesh Garg parv...@zettata.com wrote:

  Hi Erick,
 
  Thanks for the suggestion. Like I said, I'm an infant.
 
  We tried synonyms both ways, "sea biscuit => seabiscuit" and "seabiscuit =>
  sea biscuit", and didn't understand exactly how it worked. But I just checked
  the analysis tool, and it seems to work perfectly fine at index time. Now,
  I can happily discard my own filter and 4 days of work. I'm happy I got to
  know a few ways on how/when not to write a Solr filter :)

  I tried the string "sea biscuit sea bird" with expand=false, and the tokens
  I got were seabiscuit, sea, bird at positions 1, 2 and 3 respectively. But at
  query time, when I enter the same term "sea biscuit sea bird", using
  edismax and qf, pf2, and pf3, the parsedQuery looks like this:

  +((text:sea) (text:biscuit) (text:sea) (text:bird)) ((text:"biscuit sea")
  (text:"sea bird")) ((text:"seabiscuit sea") (text:"biscuit sea bird"))

  What I wanted instead was this:

  +((text:seabiscuit) (text:sea) (text:bird)) ((text:"seabiscuit sea")
  (text:"sea bird")) (text:"seabiscuit sea bird")

  Looks like there isn't any other way than to pre-process the query myself
  and create the compound word. What do you mean by "just query the raw
  strings"? Am I still missing something?
 
  Parvesh Garg
  http://www.zettata.com
  (This time I did remove my phone number :) )
 
  On Mon, Oct 28, 2013 at 4:14 PM, Erick Erickson erickerick...@gmail.com
  wrote:
 
   Why did you reject using synonyms? You can have multi-word
   synonyms just fine at index time, and at query time, since the
   multiple words are already substituted in the index you don't
   need to do the same substitution, just query the raw strings.
  
   I freely acknowledge you may have very good reasons for doing
   this yourself, I'm just making sure you know what's already
   there.
  
   See:
  
  
 
 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
  
   Look particularly at the explanations for sea biscuit in that
 section.
  
   Best,
   Erick
  
  
  
   On Mon, Oct 28, 2013 at 3:47 AM, Parvesh Garg parv...@zettata.com
  wrote:
  
One more thing, Is there a way to remove my accidentally sent phone
   number
in the signature from the previous mail? aarrrggghhh
   
  
 



Re: Solr - what's the next big thing?

2013-10-28 Thread Otis Gospodnetic
Hi,

On Sun, Oct 27, 2013 at 2:57 PM, Saar Carmi saarca...@gmail.com wrote:
 If I get it right, Solr can store its data files on HDFS but it will not

Correct.
And indices can be built in parallel, using MapReduce, from
data living on HDFS.

 use map reduce to process the data (e.g. evaluating queries).

Right. MapReduce jobs are typically not a sub-second process, while
search queries typically need to be very quick.
That said, one could run a query and then apply MapReduce-based
processing on the search results.  There is no support for that in
Solr today.

 I was wondering whether Solr could utilize the Hadoop job distribution
 mechanism to make better use of resources.
 On the other hand, maybe this is not needed with the availability of Solr
 Cloud.

Maybe you are thinking Solr on YARN?

Mark Miller can probably say a word or two or three on this topic.

 Bill Bell, could you elaborate on complex object indexing?

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


 On Sat, Oct 26, 2013 at 10:04 PM, Otis Gospodnetic 
 otis.gospodne...@gmail.com wrote:

 Hi,

 On Sat, Oct 26, 2013 at 5:58 AM, Saar Carmi saarca...@gmail.com wrote:
  LOL,  Jack.  I can imagine Otis saying that.

 Funny indeed, but not really.

 Otis, with this marriage, are we going to see MapReduce-based queries?

 Can you please describe what you mean by that?  Maybe with an example.

 Thanks,
 Otis
 --
 Performance Monitoring * Log Analytics * Search Analytics
 Solr & Elasticsearch Support * http://sematext.com/



  On Oct 25, 2013 10:03 PM, Jack Krupansky j...@basetechnology.com
 wrote:
 
  But a lot of that big yellow elephant stuff is in 4.x anyway.
 
  (Otis: I was afraid that you were going to say that the next big thing
 in
  Solr is... Elasticsearch!)
 
  -- Jack Krupansky
 
  -Original Message- From: Otis Gospodnetic
  Sent: Friday, October 25, 2013 2:43 PM
  To: solr-user@lucene.apache.org
  Subject: Re: Solr - what's the next big thing?
 
  Saar,
 
  The marriage with the big yellow elephant is a big deal. It changes the
  scale.
 
  Otis
  Solr & ElasticSearch Support
  http://sematext.com/
  On Oct 25, 2013 5:32 AM, Saar Carmi saarca...@gmail.com wrote:
 
   If I am not mistaken the most impressive improvement of Solr 4.0
 compared
  to previous versions was the Solr Cloud architecture.
 
  What would be the next big thing in Solr 5.0 ?
 
  Saar
 
 
 




 --
 Saar Carmi

 Mobile: 054-7782417
 Email: saarca...@gmail.com


Re: Need idea to standardize keywords - ring tone vs ringtone

2013-10-28 Thread Developer
Thanks for your response, Erick. Sorry for the confusion.

I currently display both 'ring tone' as well as 'ringtone' when the user
types in 'r', but I am trying to figure out a way to display just 'ringtone';
hence I added 'ring tone' to the stopwords list so that it doesn't get indexed.

I have a list of known keywords (more like synonyms) which I am trying to
map against the user-entered keywords.

ring tone, ringer tone => ringtone







Replace document title with filename if it's empty

2013-10-28 Thread Bayu Widyasanyata
Hi,

I just found that some of the PDF files crawled have no (empty) 'title'
metadata.
How can I fetch the filename and use it to replace an empty 'title' field?

I didn't find a filename field in schema.xml, and I don't know how to make
the conditional for the above (if title is empty, then ...).
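
Maybe something like this custom update processor is the direction; just a
sketch, where the source field name "url" is my guess at whatever holds the
document location, and the UpdateRequestProcessorFactory wrapper is omitted:

import java.io.IOException;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;

public class TitleFromFilenameProcessor extends UpdateRequestProcessor {
  public TitleFromFilenameProcessor(UpdateRequestProcessor next) {
    super(next);
  }

  @Override
  public void processAdd(AddUpdateCommand cmd) throws IOException {
    SolrInputDocument doc = cmd.getSolrInputDocument();
    Object title = doc.getFieldValue("title");
    Object url = doc.getFieldValue("url");   // assumed source field
    if ((title == null || title.toString().trim().isEmpty()) && url != null) {
      String u = url.toString();
      // fall back to the last path segment (the filename) as the title
      doc.setField("title", u.substring(u.lastIndexOf('/') + 1));
    }
    super.processAdd(cmd);
  }
}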

Thanks in advance.

-- 
wassalam,
[bayu]


Re: Apache-Solr with Tomcat: displaying the format of search result

2013-10-28 Thread Shawn Heisey
On 10/28/2013 4:40 AM, pyramesh wrote:
 But this is not what I want. I want to display the data in the same format
 as the input.
 
 Can anyone please help with this?

What Solr outputs in its fields for search results is identical to what
it receives when data is indexed, unless you have update processors
configured that change the data.  The analysis chain that you define in
schema.xml is *NOT* applied to stored data, only indexed data.

If the search results are not coming out in the format that you want, it
is either arriving at Solr incorrectly, or you have one or more update
processors that are changing it.

Thanks,
Shawn



Re: Need idea to standardize keywords - ring tone vs ringtone

2013-10-28 Thread Jonathan Rochkind
Do you know about the Solr synonym feature?  That seems more applicable 
to what you're describing than stopwords. I'd stay away from stopwords 
entirely here, and try to do what you want with synonyms.


Multi-word synonyms can be tricky, and I'm not entirely sure of the right way 
to do it for this use case. But I think the synonym feature is what you 
want. Not the stopwords feature.
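
For example, an explicit mapping in synonyms.txt might look like this (just
a sketch, and I'm assuming "ringer tone" was the intended second variant):

  ring tone, ringer tone => ringtone

With the explicit => form, everything on the left-hand side is replaced by
"ringtone" at analysis time, so only the normalized term survives.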




On 10/28/13 12:24 PM, Developer wrote:

Thanks for your response, Erick. Sorry for the confusion.

I currently display both 'ring tone' as well as 'ringtone' when the user
types in 'r', but I am trying to figure out a way to display just 'ringtone';
hence I added 'ring tone' to the stopwords list so that it doesn't get indexed.

I have a list of known keywords (more like synonyms) which I am trying to
map against the user-entered keywords.

ring tone, ringer tone => ringtone








Re: When is/should qf different from pf?

2013-10-28 Thread Amit Nithian
Thanks Erick. Numeric fields make sense, as I guess would strictly-string
fields too, since it's one term? In the normal text searching case, though,
does it make sense to have qf and pf differ?

Thanks
Amit
On Oct 28, 2013 3:36 AM, Erick Erickson erickerick...@gmail.com wrote:

 The facetious answer is when phrases aren't important in the fields.
 If you're doing a simple boolean match, adding phrase fields will add
 expense, to no good purpose etc. Phrases on numeric
 fields seems wrong.

 FWIW,
 Erick


 On Mon, Oct 28, 2013 at 1:03 AM, Amit Nithian anith...@gmail.com wrote:

  Hi all,
 
  I have been using Solr for years but never really stopped to wonder:
 
  When using the dismax/edismax handler, when do you have the qf different
  from the pf?
 
  I have always set them to be the same (maybe different weights) but I was
  wondering if there is a situation where you would have a field in the qf
  not in the pf or vice versa.
 
  My understanding from the docs is that qf is a term-wise hard filter
 while
  pf is a phrase-wise boost of documents who made it past the qf filter.
 
  Thanks!
  Amit
 



Re: When is/should qf different from pf?

2013-10-28 Thread Upayavira
There'd be no point having them the same.

You're likely to include boosts in your pf, so that docs that match the
phrase query as well as the term query score higher than those that just
match the term query.

Such as:

  qf=text description
  pf=text^2 description^4
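
The same idea from SolrJ, as a rough sketch (assuming an existing
HttpSolrServer instance named server):

  SolrQuery query = new SolrQuery("apache solr");
  query.set("defType", "edismax");
  query.set("qf", "text description");        // term-wise matching fields
  query.set("pf", "text^2 description^4");    // phrase-boost fields
  QueryResponse rsp = server.query(query);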

Upayavira

On Mon, Oct 28, 2013, at 05:44 PM, Amit Nithian wrote:
 Thanks Erick. Numeric fields make sense, as I guess would strictly-string
 fields too, since it's one term? In the normal text searching case, though,
 does it make sense to have qf and pf differ?
 
 Thanks
 Amit
 On Oct 28, 2013 3:36 AM, Erick Erickson erickerick...@gmail.com
 wrote:
 
  The facetious answer is when phrases aren't important in the fields.
  If you're doing a simple boolean match, adding phrase fields will add
  expense, to no good purpose etc. Phrases on numeric
  fields seems wrong.
 
  FWIW,
  Erick
 
 
  On Mon, Oct 28, 2013 at 1:03 AM, Amit Nithian anith...@gmail.com wrote:
 
   Hi all,
  
   I have been using Solr for years but never really stopped to wonder:
  
   When using the dismax/edismax handler, when do you have the qf different
   from the pf?
  
   I have always set them to be the same (maybe different weights) but I was
   wondering if there is a situation where you would have a field in the qf
   not in the pf or vice versa.
  
   My understanding from the docs is that qf is a term-wise hard filter
  while
   pf is a phrase-wise boost of documents who made it past the qf filter.
  
   Thanks!
   Amit
  
 


Solr block join

2013-10-28 Thread Simon
Hi,

The block join feature introduced in Solr 4.5 is really helpful in solving
some of the issues in my project. I am able to get it working in simple
cases. However, I couldn't figure out how to use it in some more complex
cases, and I could find very little reference material about it.
1) How do I return both parent document fields and child document fields in
the same result (in SolrJ)?
2) How do I apply 'OR' across multiple child document types (searching for
documents that meet the conditions of either child document type 1 or child
document type 2)?
3) If result/sort/facet fields come from child documents, how do I define
them in the schema? All I can think of is to create a copyField for each of
them in the parent documents. Is there a better way?
4) Does block join work for multiple child levels, such as child and
grandchild documents, etc.?

Has anyone had similar issues and would like to share their solutions?

Thanks,
Simon
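
P.S. For reference, the simple case I do have working looks roughly like
this (field names are made up). Indexing a nested document from SolrJ:

  SolrInputDocument parent = new SolrInputDocument();
  parent.addField("id", "p1");
  parent.addField("content_type", "parent");
  SolrInputDocument child = new SolrInputDocument();
  child.addField("id", "p1c1");
  child.addField("comment_text", "solr");
  parent.addChildDocument(child);   // block/nested indexing, new in 4.5
  server.add(parent);

and then matching parents by a child condition:

  q={!parent which="content_type:parent"}comment_text:solr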





Re: Solr 4.5.1 replication Bug? Illegal to have multiple roots (start tag in epilog?).

2013-10-28 Thread Michael Tracey
Hey, this is Michael, who was having the exact error on the Jetty side; here
is an update.  I've upgraded Jetty from the 4.5.1 embedded version (in the
example directory) to version 9.0.6, which meant I had to upgrade my OpenJDK
from 1.6 to 1.7.0_45.  Also, I added the suggested (very large) settings to my
solrconfig.xml:

<requestParsers enableRemoteStreaming="true" formdataUploadLimitInKB="2048000"
multipartUploadLimitInKB="2048000" />

but I am still getting the errors when I put a second server in the cloud.
A single server (external zookeeper, but no cloud partner) works just fine.

I suppose my next step is to try Tomcat, but according to your post, it will 
not help!

Any help is appreciated,

M.

- Original Message -
From: Sai Gadde gadde@gmail.com
To: solr-user@lucene.apache.org
Sent: Monday, October 28, 2013 7:10:41 AM
Subject: Solr 4.5.1 replication Bug? Illegal to have multiple roots (start tag 
in epilog?).

We have a similar error to the one in this thread:

http://www.mail-archive.com/solr-user@lucene.apache.org/msg90748.html

We tried the Tomcat setting from that post, using the exact settings
specified there. We merge 500 documents at a time. I am creating a new thread
because Michael is using Jetty whereas we use Tomcat.


The formdataUploadLimitInKB and multipartUploadLimitInKB limits are set to a
very high value (2 GB), as suggested in the following thread:
https://issues.apache.org/jira/browse/SOLR-5331


We use out-of-the-box Solr 4.5.1 with no customization. If we merge
documents via SolrJ to a single server, it works perfectly fine.


But as soon as we add another node to the cloud, we get the
following while merging documents.



This is the error we are getting on the server where the merging is
happening (10.10.10.116; the IP is irrelevant, just for clarity). 10.10.10.119
is the new node here. This server gets a RemoteSolrException:


shard update error StdNode:
http://10.10.10.119:8980/solr/mycore/:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
Illegal to have multiple roots (start tag in epilog?).
 at [row,col {unknown-source}]: [1,12468]
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:425)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
at 
org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:401)
at 
org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)





On the other server 10.10.10.119 we get following error


org.apache.solr.common.SolrException: Illegal to have multiple roots
(start tag in epilog?).
 at [row,col {unknown-source}]: [1,12468]
at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at 
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:936)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
at 
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
at 

Re: Compound words

2013-10-28 Thread Parvesh Garg
Hi Roman, thanks for the link, will go through it.

Erick, will try with expand=true once and check out the results. Will
update this thread with the findings. I remember we rejected expand=true
because of some weird spaghetti problem. Will check it out again.

Thanks,

Parvesh Garg
http://www.zettata.com


On Mon, Oct 28, 2013 at 9:01 PM, Roman Chyla roman.ch...@gmail.com wrote:

 Hi Parvesh,
 I think you should check the following jira
 https://issues.apache.org/jira/browse/SOLR-5379. You will find there links
 to other possible solutions/problems:-)
 Roman
 On 28 Oct 2013 09:06, Erick Erickson erickerick...@gmail.com wrote:

  Consider setting expand=true at index time. That
  puts all the tokens in your index, and then you
  may not need to have any synonym
  processing at query time since all the variants will
  already be in the index.
 
  As it is, you've replaced the words in the original with
  synonyms, essentially collapsed them down to a single
  word and then you have to do something at query time
  to get matches. If all the variants are in the index, you
  shouldn't have to. That's what I meant by raw.
 
  Best,
  Erick
 
 
  On Mon, Oct 28, 2013 at 8:02 AM, Parvesh Garg parv...@zettata.com
 wrote:
 
   Hi Erick,
  
   Thanks for the suggestion. Like I said, I'm an infant.
  
   We tried synonyms both ways: sea biscuit => seabiscuit and seabiscuit =>
   sea biscuit, and didn't understand exactly how it worked. But I just
  checked
   the analysis tool, and it seems to work perfectly fine at index time.
  Now,
   I can happily discard my own filter and 4 days of work. I'm happy I got
  to
   know a few ways on how/when not to write a solr filter :)
  
    I tried the string "sea biscuit sea bird" with expand=false, and the tokens
    I got were seabiscuit, sea, bird at positions 1, 2 and 3 respectively. But at
    query time, when I enter the same term "sea biscuit sea bird", using
    edismax and qf, pf2, and pf3, the parsedQuery looks like this:
  
    +((text:sea) (text:biscuit) (text:sea) (text:bird)) ((text:"biscuit sea")
    (text:"sea bird")) ((text:"seabiscuit sea") (text:"biscuit sea bird"))
  
   What I wanted instead was this
  
    +((text:seabiscuit) (text:sea) (text:bird)) ((text:"seabiscuit sea")
    (text:"sea bird")) (text:"seabiscuit sea bird")
  
   Looks like there isn't any other way than to pre-process query myself
 and
   create the compound word. What do you mean by just query the raw
  string?
   Am I still missing something?
  
   Parvesh Garg
   http://www.zettata.com
   (This time I did remove my phone number :) )
  
   On Mon, Oct 28, 2013 at 4:14 PM, Erick Erickson 
 erickerick...@gmail.com
   wrote:
  
Why did you reject using synonyms? You can have multi-word
synonyms just fine at index time, and at query time, since the
multiple words are already substituted in the index you don't
need to do the same substitution, just query the raw strings.
   
I freely acknowledge you may have very good reasons for doing
this yourself, I'm just making sure you know what's already
there.
   
See:
   
   
  
 
 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
   
Look particularly at the explanations for sea biscuit in that
  section.
   
Best,
Erick
   
   
   
On Mon, Oct 28, 2013 at 3:47 AM, Parvesh Garg parv...@zettata.com
   wrote:
   
 One more thing, Is there a way to remove my accidentally sent
 phone
number
 in the signature from the previous mail? aarrrggghhh

   
  
 



Single multilingual field analyzed based on other field values

2013-10-28 Thread David Anthony Troiano
Hello,

First some background...

I am indexing a multilingual document set where documents themselves can
contain multiple languages.  The language(s) within my documents are known
ahead of time.  I have tried separate fields per language, and due to the
poor query performance I'm seeing with that approach (many languages /
fields), I'm trying to create a single multilingual field.

One approach to this problem is given in Section 14.6.4
(https://docs.google.com/a/basistech.com/file/d/0B3NlE_uL0pqwR0hGV0M1QXBmZm8/edit)
of the new Solr In Action book.  The approach is to take the document
content field and prepend it with the list of contained languages followed by
a special delimiter.  A new field type is defined that maps languages to
sub field types, and the new type's tokenizer then runs all of the sub
field type analyzers over the field and merges results, adjusts offsets for
the prepended data, etc.

Due to the tokenizer complexity incurred, I'd like to pursue a more
flexible approach, which is to run the various language-specific analyzers
not based on prepended codes, but instead based on other field values
(i.e., a language field).

I don't see a straightforward way to do this, mostly because a field
analyzer doesn't have access to the rest of the document.  On the flip
side, an UpdateRequestProcessor would have access to the document but
doesn't really give a path to wind up where I want to be (single field with
different analyzers run dynamically).

Finally, my question: is it possible to thread cache document language(s)
during UpdateRequestProcessor execution (where we have access to the full
document), so that the analyzer can then read from the cache to determine
which analyzer(s) to run?  More specifically, if a document is run through
its URP chain on thread T, will its analyzer(s) also run on thread T and
will no other documents be run through the URP on that thread in the
interim?

Thanks,
Dave


Re: Single multilingual field analyzed based on other field values

2013-10-28 Thread Jack Krupansky
Consider an update processor - it can operate on any field and has access to 
all fields.


You could have one update processor to combine all the fields to process, 
into a temporary, dummy field. Then run a language detection update 
processor on the combined field. Then process the results and place in the 
desired field. And finally remove any temporary fields.
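
As a rough sketch of that chain in solrconfig.xml; this assumes the
langdetect contrib jar is on the classpath, title/body stand in for your
real source fields, and the final "place results in the desired field" step
would still be custom:

<updateRequestProcessorChain name="langid">
  <!-- combine the fields to process into a temporary, dummy field -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">title</str>
    <str name="source">body</str>
    <str name="dest">text_all_tmp</str>
  </processor>
  <!-- detect the language(s) of the combined field -->
  <processor class="solr.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <str name="langid.fl">text_all_tmp</str>
    <str name="langid.langField">language</str>
  </processor>
  <!-- remove the temporary field before indexing -->
  <processor class="solr.IgnoreFieldUpdateProcessorFactory">
    <str name="fieldName">text_all_tmp</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>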


-- Jack Krupansky
-Original Message- 
From: David Anthony Troiano

Sent: Monday, October 28, 2013 4:47 PM
To: solr-user@lucene.apache.org
Subject: Single multilingual field analyzed based on other field values

Hello,

First some background...

I am indexing a multilingual document set where documents themselves can
contain multiple languages.  The language(s) within my documents are known
ahead of time.  I have tried separate fields per language, and due to the
poor query performance I'm seeing with that approach (many languages /
fields), I'm trying to create a single multilingual field.

One approach to this problem is given in Section 14.6.4
(https://docs.google.com/a/basistech.com/file/d/0B3NlE_uL0pqwR0hGV0M1QXBmZm8/edit)
of the new Solr In Action book.  The approach is to take the document
content field and prepend it with the list of contained languages followed by
a special delimiter.  A new field type is defined that maps languages to
sub field types, and the new type's tokenizer then runs all of the sub
field type analyzers over the field and merges results, adjusts offsets for
the prepended data, etc.

Due to the tokenizer complexity incurred, I'd like to pursue a more
flexible approach, which is to run the various language-specific analyzers
not based on prepended codes, but instead based on other field values
(i.e., a language field).

I don't see a straightforward way to do this, mostly because a field
analyzer doesn't have access to the rest of the document.  On the flip
side, an UpdateRequestProcessor would have access to the document but
doesn't really give a path to wind up where I want to be (single field with
different analyzers run dynamically).

Finally, my question: is it possible to thread cache document language(s)
during UpdateRequestProcessor execution (where we have access to the full
document), so that the analyzer can then read from the cache to determine
which analyzer(s) to run?  More specifically, if a document is run through
its URP chain on thread T, will its analyzer(s) also run on thread T and
will no other documents be run through the URP on that thread in the
interim?

Thanks,
Dave 



Re: Index JTS Point in Solr/Lucene index

2013-10-28 Thread David Smiley (@MITRE.org)
Just following up on this thread after a round of emails between Shahbaz
and me…
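
The upshot, as a minimal sketch (assuming a spatial field named myGeoField,
and WKT's lon/lat ordering):

  solrInputDocument.addField("myGeoField", "POINT(55.76056 24.19167)");
  solrServer.add(solrInputDocument);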


David Smiley wrote
 Ooooh, I see your confusion.  You looked at code in an
 UpdateRequestProcessor and expected it to work on the client in SolrJ.  It
 won't work for the reason that the code in the URP is creating a
 non-string object (a Shape subclass) whereas SolrJ expects Strings or
 numbers.  You need to use Shape formatted strings.  If you have a generic
 Shape and want to serialize it to a String without special casing Point,
 etc., then you can use SpatialContext.toString(shape).


Shahbaz lodhi wrote
 Hi,
 
 *Story:*
  I am trying to index a *JTS point* in the following format; not successful
  though:
 Pt(x=55.76056,y=24.19167)
 It is the format that i get by ctx.readShape( shapeString ).
 
  I don't get any error when reading the shape or adding the shape
  to solrInputDocument, but it prompts *error reading WKT* on adding the
  document to Solr (i.e. solrServer.add(solrInputDocument)).
 *
 *
 
 *Question:*
 Is it a legal way to index:
 solrInputDocument.addField(myGeoField,
 JtsSpatialContext.GEO.readShape(shapeString));
 solr.add(solrInputDocument);
 
  or I'll have to stick to the WKT format.
 
 
 
 
 Any help will be highly appreciated.
 
 Thanks,
 
 Shahbaz





-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book


Re: Global User defined properties - solr.xml from Solr 4.4 to Solr 4.5

2013-10-28 Thread marotosg
Done
https://issues.apache.org/jira/browse/SOLR-5398





Re: Background merge errors with Solr 4.4.0 on Optimize call

2013-10-28 Thread Matthew Shapiro
Thanks for your response.

You were right, solr is logging to the catalina.out file for tomcat.  When
I click the optimize button in solr's admin interface the following logs
are written: http://apaste.info/laup

About JVM memory, solr's admin interface is listing JVM memory at 3.1%
(221.7MB is dark grey, 512.56MB light grey and 6.99GB total).


On Mon, Oct 28, 2013 at 6:29 AM, Erick Erickson erickerick...@gmail.comwrote:

 For Tomcat, the Solr log output often goes into catalina.out
 by default, so the output might be there. You can
 configure Solr to send the logs most anywhere you
 please, but without some specific setup
 on your part the log output just goes to the default
 for the servlet.

 I took a quick glance at the code but since the merges
 are happening in the background, there's not much
 context for where that error is thrown.

 How much memory is there for the JVM? I'm grasping
 at straws a bit...

 Erick


 On Sun, Oct 27, 2013 at 9:54 PM, Matthew Shapiro m...@mshapiro.net wrote:

  I am working on implementing Solr as the search backend for our web
  system.  So far things have been going well, but today I made some schema
  changes and now things have broken.
 
  I updated the schema.xml file and reloaded the core (via the admin
  interface).  No errors were reported in the logs.
 
  I then pushed 100 records to be indexed.  A call to Commit afterwards
  seemed fine, however my next call for Optimize caused the following
 errors:
 
  java.io.IOException: background merge hit exception:
  _2n(4.4):C4263/154 _30(4.4):C134 _32(4.4):C10 _31(4.4):C10 into _37
  [maxNumSegments=1]
 
  null:java.io.IOException: background merge hit exception:
  _2n(4.4):C4263/154 _30(4.4):C134 _32(4.4):C10 _31(4.4):C10 into _37
  [maxNumSegments=1]
 
 
  Unfortunately, googling for "background merge hit exception" came up
  with two things: a corrupt index or not enough free space.  The host
  machine that's hosting solr has 227 out of 229GB free (according to df
  -h), so that's not it.
 
 
  I then ran CheckIndex on the index, and got the following results:
  http://apaste.info/gmGU
 
 
  As someone who is new to solr and lucene, as far as I can tell this
  means my index is fine. So I am coming up at a loss. I'm somewhat sure
  that I could probably delete my data directory and rebuild it but I am
  more interested in finding out why it is having issues, what is the
  best way to fix it, and what is the best way to prevent it from
  happening when this goes into production.
 
 
  Does anyone have any advice that may help?
 
 
  As an aside, I do not have a stacktrace for you because the solr admin
  page isn't giving me one.  I tried looking in my logs file in my solr
  directory, but it does not contain any logs.  I opened up my
  ~/tomcat/lib/log4j.properties file and saw http://apaste.info/0rTL,
  which didn't really help me find log files.  Doing a 'find . | grep
  solr.log' didn't really help either.  Any help for finding log files
  (which may help find the actual cause of this) would also be
  appreciated.
 



Re: Background merge errors with Solr 4.4.0 on Optimize call

2013-10-28 Thread Matthew Shapiro
Sorry for reposting after I just sent in a reply, but I just looked at the
error trace more closely and noticed:


   1. Caused by: java.lang.IllegalArgumentException: no such field what


The 'what' field was removed at the request of the customer, as they wanted
the logic behind what gets queried in the 'what' field to be on the code side
instead of the Solr side (for easier changing without having to re-index
everything). I didn't feel strongly either way, and since they are paying me,
I took it out.

This makes me wonder if it's crashing while merging because a field that
used to be there is now gone.  However, this seems odd to me, as Solr
isn't even letting me delete the old data and is instead leaving my
collection in an extremely bad state; the only remedy I can think of
is to nuke the index at the filesystem level.

If this is indeed the cause of the crash, is the only way to delete a field
to first completely empty the index?
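
If it does come to nuking, I'd rather go through the API than the
filesystem; something like this SolrJ sketch (assuming an existing
HttpSolrServer named server), though I don't know yet whether it is
actually necessary:

  server.deleteByQuery("*:*");   // drop every document
  server.commit();
  // ...then re-push the records and try the optimize again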


On Mon, Oct 28, 2013 at 6:34 PM, Matthew Shapiro m...@mshapiro.net wrote:

 Thanks for your response.

 You were right, solr is logging to the catalina.out file for tomcat.  When
 I click the optimize button in solr's admin interface the following logs
 are written: http://apaste.info/laup

 About JVM memory, solr's admin interface is listing JVM memory at 3.1%
 (221.7MB is dark grey, 512.56MB light grey and 6.99GB total).


 On Mon, Oct 28, 2013 at 6:29 AM, Erick Erickson 
 erickerick...@gmail.comwrote:

  For Tomcat, the Solr log output often goes into catalina.out
  by default, so the output might be there. You can
 configure Solr to send the logs most anywhere you
 please, but without some specific setup
 on your part the log output just goes to the default
 for the servlet.

 I took a quick glance at the code but since the merges
 are happening in the background, there's not much
 context for where that error is thrown.

 How much memory is there for the JVM? I'm grasping
 at straws a bit...

 Erick


 On Sun, Oct 27, 2013 at 9:54 PM, Matthew Shapiro m...@mshapiro.net wrote:

   I am working on implementing Solr as the search backend for our web
   system.  So far things have been going well, but today I made some schema
  changes and now things have broken.
 
  I updated the schema.xml file and reloaded the core (via the admin
  interface).  No errors were reported in the logs.
 
  I then pushed 100 records to be indexed.  A call to Commit afterwards
  seemed fine, however my next call for Optimize caused the following
 errors:
 
  java.io.IOException: background merge hit exception:
  _2n(4.4):C4263/154 _30(4.4):C134 _32(4.4):C10 _31(4.4):C10 into _37
  [maxNumSegments=1]
 
  null:java.io.IOException: background merge hit exception:
  _2n(4.4):C4263/154 _30(4.4):C134 _32(4.4):C10 _31(4.4):C10 into _37
  [maxNumSegments=1]
 
 
   Unfortunately, googling for "background merge hit exception" came up
   with two things: a corrupt index or not enough free space.  The host
  machine that's hosting solr has 227 out of 229GB free (according to df
  -h), so that's not it.
 
 
  I then ran CheckIndex on the index, and got the following results:
  http://apaste.info/gmGU
 
 
  As someone who is new to solr and lucene, as far as I can tell this
  means my index is fine. So I am coming up at a loss. I'm somewhat sure
  that I could probably delete my data directory and rebuild it but I am
   more interested in finding out why it is having issues, what is the
  best way to fix it, and what is the best way to prevent it from
  happening when this goes into production.
 
 
  Does anyone have any advice that may help?
 
 
   As an aside, I do not have a stacktrace for you because the solr admin
  page isn't giving me one.  I tried looking in my logs file in my solr
  directory, but it does not contain any logs.  I opened up my
  ~/tomcat/lib/log4j.properties file and saw http://apaste.info/0rTL,
   which didn't really help me find log files.  Doing a 'find . | grep
  solr.log' didn't really help either.  Any help for finding log files
  (which may help find the actual cause of this) would also be
  appreciated.
 





Solr 4.5.1 Overseer error

2013-10-28 Thread dboychuck
I am upgrading from 4.4 to 4.5.1.

I used to just upload my configurations to zookeeper and then install Solr
with no default core.
Solr would give me an error that no cores were created when I tried to
access it, until I ran the collections API create command to make a collection.

However, now when I try to install Solr with no default core I get a generic
error about "path cannot end with /", and I can't create the cores using the
collections API.

When I manually copy the files over and create the core through the
interface, it all works as expected.
Any help would be appreciated.

Here is the error i'm seeing
http://pastebin.com/cEfpSEqe

here's my solr.xml
http://pastebin.com/kBLv9Vvt

and here are my startup arguments
http://pastebin.com/7tCrSpX9







Re: Single multilingual field analyzed based on other field values

2013-10-28 Thread Trey Grainger
Hi David,

What version of the Solr in Action MEAP are you looking at (current version
is 12, and version 13 is coming out later this week, and prior versions had
significant bugs in the code you are referencing)?  I added an update
processor in the most recent version that can do language identification
and prepend the language codes for you (even removing them from the stored
version of the field and only including them on the indexed version for
text analysis).

You could easily modify this update processor to read the value from the
language field and use it as the basis of the pre-pended languages.

Otherwise, if you want to do language detection instead of passing in the
language manually, MultiTextField in chapter 14 of Solr in Action and the
corresponding MultiTextFieldLanguageIdentifierUpdateProcessor should handle
all of the language detection and pre-pending automatically for you (and
also append the identified language to a separate field).

If it were easy/possible to have access to the rest of the fields in the
document from within a field's Analyzer then I would have certainly opted
for that approach instead of the whole pre-pending languages to content
option.  If it is too cumbersome, you could probably rewrite the
MultiTextField to pull the language from the field name instead of the
content (i.e. <field name="myField|en,fr">blah, blah</field> instead of
<field name="myField">en,fr|blah, blah</field> as currently designed).
 This would make specifying the language much easier (especially at query
time since you only have to specify the languages once instead of on each
term), and you could have Solr still search the same underlying field for
all languages.  Same general idea, though.

In terms of your ThreadLocal cache idea... that sounds really scary to me.
 The Analyzers' TokenStreamComponents are cached in a ThreadLocal context
depending upon the internal ReusePolicy, and I'm skeptical that you'll
be able to pull this off cleanly.  It would really be hacking around the
Lucene APIs even if you were able to pull it off.

-Trey


On Mon, Oct 28, 2013 at 5:15 PM, Jack Krupansky j...@basetechnology.comwrote:

 Consider an update processor - it can operate on any field and has access
 to all fields.

 You could have one update processor to combine all the fields to process,
 into a temporary, dummy field. Then run a language detection update
 processor on the combined field. Then process the results and place in the
 desired field. And finally remove any temporary fields.

 -- Jack Krupansky
 -Original Message- From: David Anthony Troiano
 Sent: Monday, October 28, 2013 4:47 PM
 To: solr-user@lucene.apache.org
 Subject: Single multilingual field analyzed based on other field values


 Hello,

 First some background...

 I am indexing a multilingual document set where documents themselves can
 contain multiple languages.  The language(s) within my documents are known
 ahead of time.  I have tried separate fields per language, and due to the
 poor query performance I'm seeing with that approach (many languages /
 fields), I'm trying to create a single multilingual field.

 One approach to this problem is given in Section 14.6.4
 (https://docs.google.com/a/basistech.com/file/d/0B3NlE_uL0pqwR0hGV0M1QXBmZm8/edit)
 of the new Solr In Action book.  The approach is to take the document
 content field and prepend it with the list of contained languages followed by
 a special delimiter.  A new field type is defined that maps languages to
 sub field types, and the new type's tokenizer then runs all of the sub
 field type analyzers over the field and merges results, adjusts offsets for
 the prepended data, etc.

 Due to the tokenizer complexity incurred, I'd like to pursue a more
 flexible approach, which is to run the various language-specific analyzers
 not based on prepended codes, but instead based on other field values
 (i.e., a language field).

 I don't see a straightforward way to do this, mostly because a field
 analyzer doesn't have access to the rest of the document.  On the flip
 side, an UpdateRequestProcessor would have access to the document but
 doesn't really give a path to wind up where I want to be (single field with
 different analyzers run dynamically).

 Finally, my question: is it possible to thread cache document language(s)
 during UpdateRequestProcessor execution (where we have access to the full
 document), so that the analyzer can then read from the cache to determine
 which analyzer(s) to run?  More specifically, if a document is run through
 its URP chain on thread T, will its analyzer(s) also run on thread T and
 will no other documents be run through the URP on that thread in the
 interim?

 Thanks,
 Dave



Re: Solr 4.5.1 Overseer error

2013-10-28 Thread Shawn Heisey

On 10/28/2013 5:50 PM, dboychuck wrote:

I am upgrading from 4.4 to 4.5.1

  I used to just upload my configurations to zookeeper and then install solr
with no default core
Solr would give me an error that no cores were created when I tried to
access until I ran the collections API create command to make a collection

however now when I try to install solr with no default core I get a generic
error about path cannot end with / and I can't create the cores using the
collections api

when I manually copy the files over and create the core through the
interface it all works as expected
any help would be appreciated


Working on IRC, we were able to track this down to a work item in the 
overseer queue in zookeeper.  It had a deletecore operation in the 
queue with the collection parameter set to an empty string.


{
  "operation":"deletecore",
  "core_node_name":"solr-shard-1.REDACTED.com:__collection1",
  "core":"collection1",
  "collection":"",
  "node_name":"solr-shard-1.REDACTED.com:_"
}

Basically, the previous version left behind some bad data in zookeeper.  
When dboychuck wiped out all the zookeeper data and started over, it all 
worked.


If you are seeing a "Path must not end with / character" error when
starting Solr, you may have some bad data in the overseer queue, which
is located in zookeeper.
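
One way to check is with ZooKeeper's own CLI; a rough sketch, where the node
name under /overseer/queue is just an example:

  bin/zkCli.sh -server localhost:2181
  ls /overseer/queue
  get /overseer/queue/qn-0000000001

If a queued work item has an empty collection value like the one above,
clearing the queue (or wiping the Solr data in zookeeper entirely, as
dboychuck did) gets things going again.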


Would it be worthwhile to file a bug so Solr can deal with these 
problems automatically and log what it's doing, or at the very least 
output a better error message?


Thanks,
Shawn



Re: how to avoid recover? how to ensure a recover success?

2013-10-28 Thread deniz
I have had a similar problem before, but the patch included with
version 4.1 fixed that... I couldn't reproduce the problem with the patch...

Is anyone able to reproduce this exception?



-
Smart, but he doesn't work... If he worked, he would get it done...


Re: Solr 4.5.1 replication Bug? Illegal to have multiple roots (start tag in epilog?).

2013-10-28 Thread Sai Gadde
Hi Michael,

I downgraded to Solr 4.4.0 and this issue is gone. No additional settings
or tweaks were needed.

This is not a fix or a real solution, I guess, but in our case we wanted
something working and we were running out of time.

I will watch this thread for any suggestions, but we will possibly
stay with 4.4.0 for some time.

Regards
Sai


On Tue, Oct 29, 2013 at 4:36 AM, Michael Tracey mtra...@biblio.com wrote:

 Hey, this is Michael, who was having the exact error on the Jetty side;
 here is an update.  I've upgraded Jetty from the 4.5.1 embedded version (in
 the example directory) to version 9.0.6, which meant I had to upgrade my
 OpenJDK from 1.6 to 1.7.0_45.  Also, I added the suggested (very large)
 settings to my solrconfig.xml:

 <requestParsers enableRemoteStreaming="true"
 formdataUploadLimitInKB="2048000" multipartUploadLimitInKB="2048000" />

 but I am still getting the errors when I put a second server in the cloud.
 A single server (external zookeeper, but no cloud partner) works just fine.

 I suppose my next step is to try Tomcat, but according to your post, it
 will not help!

 Any help is appreciated,

 M.

 - Original Message -
 From: Sai Gadde gadde@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Monday, October 28, 2013 7:10:41 AM
 Subject: Solr 4.5.1 replication Bug? Illegal to have multiple roots
 (start tag in epilog?).

 We have a similar error to the one in this thread:

 http://www.mail-archive.com/solr-user@lucene.apache.org/msg90748.html

 We tried the Tomcat setting from that post, using the exact settings
 specified there. We merge 500 documents at a time. I am creating a new
 thread because Michael is using Jetty whereas we use Tomcat.


 The formdataUploadLimitInKB and multipartUploadLimitInKB limits are set to a
 very high value (2 GB), as suggested in the following thread.
 https://issues.apache.org/jira/browse/SOLR-5331


 We use out-of-the-box Solr 4.5.1 with no customization. If we merge
 documents via SolrJ to a single server, it works perfectly fine.


 But as soon as we add another node to the cloud, we get the
 following while merging documents.



 This is the error we are getting on the server where the merging is
 happening (10.10.10.116; the IP is irrelevant, just for clarity). 10.10.10.119
 is the new node here. This server gets a RemoteSolrException:


 shard update error StdNode:

 http://10.10.10.119:8980/solr/mycore/:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException
 :
 Illegal to have multiple roots (start tag in epilog?).
  at [row,col {unknown-source}]: [1,12468]
 at
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:425)
 at
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
 at
 org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:401)
 at
 org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:1)
 at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
 at java.util.concurrent.FutureTask.run(Unknown Source)
 at java.util.concurrent.Executors$RunnableAdapter.call(Unknown
 Source)
 at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
 at java.util.concurrent.FutureTask.run(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown
 Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
 Source)
 at java.lang.Thread.run(Unknown Source)





 On the other server 10.10.10.119 we get following error


 org.apache.solr.common.SolrException: Illegal to have multiple roots
 (start tag in epilog?).
  at [row,col {unknown-source}]: [1,12468]
 at
 org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:176)
 at
 org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
 at
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
 at
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
 at
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
 at
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
 at
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
 at
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
 at
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
 at
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
 at
 

Re: Apache-Solr with Tomcat: displaying the format of search result

2013-10-28 Thread pyramesh

Thanks Shawn for the quick response...

As suggested, I verified my configuration to check whether any update
processors are configured, and found none. I just
wonder how the format is getting changed.

Let me explain my problem in detail.

I am indexing an .xml file into Solr, and below is the field configuration.

*schema.xml* (shown for one field)
=

*1. Field:*

<field name="Resolution" type="text_general" indexed="true"
multiValued="true" stored="true"/>

*2. Field Type & tokenizers*

<fieldType name="text_general" class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
            maxGramSize="25" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="false"/>
    <filter class="solr.PositionFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" catenateAll="1"/>
  </analyzer>
</fieldType>



3. *Before indexing, the data is in the below format:*

<field name="Resolution">*Issue:* ID country user X; unable xxx
wsdfsdfs sdsdfs

*Impact / Suspected Impact*: asa asdasdaav asdffcasdfassd

*Rootcause:* asdfas asdfasdwersdvsdv sdfsdfcss

(1). test 12, (2).tesst 123</field>

4. *After indexing, the data is displayed in the below format:*

*Issue:* ID country user X; unable xxx wsdfsdfs sdsdfs*Impact /
Suspected Impact*: asa asdasdaav asdffcasdfassd *Rootcause:* asdfas
asdfasdwersdvsdv sdfsdfcss (1). test 12, (2).tesst 123


Could you please guide me on how to display the search result in the same
format as the input, even after indexing?
If any update processors need to be added, which processors should I add?
Thanks in advance.


