Re: Bug in Collapsing QParserPlugin : Sort by 3 or more fields is broken

2014-06-19 Thread Umesh Prasad
Continuing the discussion on mailing list from Jira.

An Example


id   group   f1    f2
1    g1       5    10
2    g1       5    1000
3    g1       5    1000
4    g1      10    100
5    g2       5    10
6    g2       5    1000
7    g2       5    1000
8    g2      10    100

sort = f1 asc, f2 desc, id desc


Without collapse this will give:
(7,g2), (6,g2), (3,g1), (2,g1), (5,g2), (1,g1), (8,g2), (4,g1)


On collapsing by group_s, the expected output is: (7,g2), (3,g1)

Solr's standard grouping does give this output with
group=on, group.field=group_s, group.main=true

Collapsing with CollapsingQParserPlugin, fq={!collapse field=group_s}, gives:
  (5,g2), (1,g1)



Summarizing the Jira discussion:
1. CollapsingQParserPlugin picks the group heads from the matching results
and passes those on. In essence it filters out some of the matching
documents so that subsequent collectors never see them. It can also pass
on the score to subsequent collectors using a dummy scorer.

2. TopDocCollector comes later in the hierarchy and sorts the collapsed
set. That part works fine.

The issue is with step 1. Collapsing is done by a single comparator, which
can take its value from a field or a function and defaults to score.
Function queries do allow us to combine multiple fields / value sources,
but it would be difficult to construct a single function for a given list
of sort fields. Primarily because:
a) The range of values for a given sort field is not known in advance.
One sort field may be unbounded while another is bounded within a small
range.
b) A sort field can itself hold custom logic.

Because of (a), the group head selected by CollapsingQParserPlugin can be
incorrect, and the subsequent sorting breaks.
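To make (a) concrete with the data above, here is an illustrative attempt
(not from the original thread, with a hand-picked scale factor of 1000) to
emulate "f1 asc, f2 desc" with a single collapse function:

fq={!collapse field=group_s max='sub(prod(field(f1),-1000),prod(field(f2),-1))'}

i.e. the head is the document maximizing -1000*f1 + f2. With f2 bounded by
1000 this picks a reasonable head for the first two sort fields, but if f2
were unbounded (say doc 4 had f2=20000, giving -10000 + 20000 = 10000,
which beats doc 3's -4000) the secondary term would swamp the primary one,
and no fixed scale factor works for all documents. The tie between docs 2
and 3 (both -4000) also shows that a third tie-breaker like "id desc"
cannot be folded in.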



On 14 June 2014 12:38, Umesh Prasad umesh.i...@gmail.com wrote:

 Thanks Joel for the quick response. I have opened a new jira ticket.

 https://issues.apache.org/jira/browse/SOLR-6168




 On 13 June 2014 17:45, Joel Bernstein joels...@gmail.com wrote:

 Let's open a new ticket.

 Joel Bernstein
 Search Engineer at Heliosearch


 On Fri, Jun 13, 2014 at 8:08 AM, Umesh Prasad umesh.i...@gmail.com
 wrote:

   The patch in SOLR-5408 fixes the sorting issue, but only for two sort
   fields. Sorting still breaks when 3 or more sort fields are used.
 
  I have attached a test case, which demonstrates the broken behavior
 when 3
  sort fields are used.
 
  The failing test case patch is against Lucene/Solr 4.7, revision number
  1602388.
 
  Can someone apply it and verify the bug?
 
  Also, should I re-open SOLR-5408 or open a new ticket?
 
 
  ---
  Thanks & Regards
  Umesh Prasad
 




 --
 ---
 Thanks & Regards
 Umesh Prasad




-- 
---
Thanks & Regards
Umesh Prasad


Re: Warning message logs on startup after upgrading to 4.8.1

2014-06-19 Thread Marius Dumitru Florea
On Thu, Jun 19, 2014 at 12:49 AM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : WARN  o.a.s.r.ManagedResource- No stored data found for
 : /schema/analysis/stopwords/english
 : WARN  o.a.s.r.ManagedResource- No stored data found for
 : /schema/analysis/synonyms/english
 :
 : I fixed these by commenting out the managed_en field type in my
 : schema, see 
 https://github.com/xwiki/xwiki-platform/commit/d41580c383f40d2aa4e4f551971418536a3f3a20#diff-44d79e64e45f3b05115aebcd714bd897L486


 FWIW: Unless i'm missing something, you should only have gotten those
 warnings in the situation where you started using the 4.8
 example schema.xml (or cut/pasted those field types from 4.8 into your
 existing schema) but didn't use the rest of the conf files that came with
 4.8 -- so you didn't have the stored-data JSON file that goes with it. In
 that case it is a legitimate warning: you have an analysis factory
 configured to use a managed resource, but no managed data file is
 available.

Yes, you're right, I've merged my schema with the one provided with 4.8.
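
For reference, the field type that triggers these warnings looks roughly
like this in the 4.8 example schema (reconstructed from memory here, not
quoted from the thread):

<fieldType name="managed_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ManagedStopFilterFactory" managed="english"/>
    <filter class="solr.ManagedSynonymFilterFactory" managed="english"/>
  </analyzer>
</fieldType>

The managed="english" attribute is what points at the
/schema/analysis/stopwords/english and /schema/analysis/synonyms/english
resources named in the warnings.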


 : WARN  o.a.s.r.ManagedResource- No stored data found for 
 /rest/managed
 : WARN  o.a.s.r.ManagedResource- No registered observers for 
 /rest/managed
 :
 : How can I get rid of these 2?
 :
 : This jira issue is related https://issues.apache.org/jira/browse/SOLR-6128 .

 I agree, there's no reason i can see for those to be warnings -- so as to
 keep SOLR-6128 focused on just one thing, i've created SOLR-6179 to track
 the ManagedResource WARNs...


 https://issues.apache.org/jira/browse/SOLR-6179

Thanks,
Marius



 -Hoss
 http://www.lucidworks.com/


Re: add new Fields with SolrJ without changing schema.xml

2014-06-19 Thread benjelloun
Hello,

Because I will not stay at the current enterprise, no one will know how to
change the schema manually if they need to add new fields in the future.
So I need to do this from Java code. Can you please help me with an
example to complete this:

public static void addNewField(Boolean uniqueId, String type,
        Boolean indexed, Boolean stored, Boolean multivalued,
        Boolean sortmissinglast, Boolean required) {
    ...
}

thanks,
Best regards,
Anass BENJELLOUN
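
A minimal sketch of one way to do this from Java in the Solr 4.x era, via
the Schema REST API rather than SolrJ (SolrJ had no schema API support
yet). Assumptions: the core uses ManagedIndexSchemaFactory with
mutable="true", and the class and method names here are hypothetical:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class SchemaFieldAdder {
    // Adds one field by PUTting JSON to /schema/fields/<name>.
    // Note: uniqueKey and sortMissingLast cannot be set this way --
    // uniqueKey is a schema-level element and sortMissingLast is a
    // fieldType property, so both still require a schema edit.
    public static void addNewField(String solrCoreUrl, String name,
            String type, boolean indexed, boolean stored,
            boolean multiValued, boolean required) throws Exception {
        String json = String.format(
                "{\"type\":\"%s\",\"indexed\":%b,\"stored\":%b," +
                "\"multiValued\":%b,\"required\":%b}",
                type, indexed, stored, multiValued, required);
        URL url = new URL(solrCoreUrl + "/schema/fields/" + name);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        try (OutputStream os = conn.getOutputStream()) {
            os.write(json.getBytes("UTF-8"));
        }
        int status = conn.getResponseCode();
        if (status != 200) {
            throw new RuntimeException("Schema update failed, HTTP " + status);
        }
    }
}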



2014-06-18 18:21 GMT+02:00 Walter Underwood [via Lucene]
ml-node+s472066n4142571...@n3.nabble.com:

 Why can't you change schema.xml?  --wunder

 On Jun 18, 2014, at 8:56 AM, benjelloun wrote:

  Hello,
 
  this is what I want to do:
 
  public static void addNewField(Boolean uniqueId, String type,
          Boolean indexed, Boolean stored, Boolean multivalued,
          Boolean sortmissinglast, Boolean required) {
      ...
  }
 
  any example please,
  thanks,
  Best regards,
  Anass BENJELLOUN
 
 
 
 




Re: add new Fields with SolrJ without changing schema.xml

2014-06-19 Thread Alexandre Rafalovitch
Use dynamic field definitions perhaps? Just suffix the fields with
_s, _i, etc., as per schema.xml.

You could also use the new schemaless mode, but then when they send a
value that auto-creates a field of the wrong type, it would be really hard
to troubleshoot.

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Thu, Jun 19, 2014 at 2:07 PM, benjelloun anass@gmail.com wrote:
 [...]

Store Java object in field and retrieve it in custom function?

2014-06-19 Thread Costi Muraru
Hi,

I'm trying to save a Java object in a binary field and afterwards use this
value in a custom Solr function.
I'm able to put and retrieve the Java object in Base64 via the UI, but I
can't seem to retrieve the value in the custom function.

In the function I'm using:

termsIndex = FieldCache.DEFAULT.getTermsIndex(reader, fieldName);
termsIndex.get(doc, spare);
Log.debug("Length: " + spare.length);

The length is always 0. It works well if the field type is not binary but
string.
Do you have any tips?

Thanks,
Costi


Re: add new Fields with SolrJ without changing schema.xml

2014-06-19 Thread benjelloun
Hello,

I will use dynamic fields for some fields, but some other fields need to
be created explicitly. Here is an example:

Field        uniqueId  Type  indexed  stored  multivalued  sortmissinglast  required
Iddocument   True      long  false    True    False        True             True

So how do I add this field? By default the id is indexed and its type is
string.
Any idea how I can do that without manually changing the schema.xml?
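
A hypothetical call using the SchemaFieldAdder sketch from earlier in this
thread, with the values from the table above (note that a real uniqueKey
field would also have to be indexed and declared in the schema's
<uniqueKey> element, which the 4.x schema REST API cannot change):

SchemaFieldAdder.addNewField("http://localhost:8983/solr/collection1",
        "Iddocument", "long", false, true, false, true);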


thanks,
 Best regards,
 Anass BENJELLOUN




2014-06-19 9:28 GMT+02:00 Alexandre Rafalovitch [via Lucene]:

 Use dynamic field definitions perhaps? Just suffix the fields with
 _s, _i, etc., as per schema.xml.
 [...]





Segment Count of my Index is greater than the Configured MergeFactor

2014-06-19 Thread RadhaJayalakshmi
Hi,
I am using Solr 4.5.1, and I have created an index of 114.8 MB. I also have
the following index configuration:

<indexConfig>
  <maxIndexingThreads>8</maxIndexingThreads>
  <ramBufferSizeMB>100</ramBufferSizeMB>
  <mergeFactor>10</mergeFactor>
</indexConfig>

I have given a ramBufferSizeMB of 100 and a mergeFactor of 10. So this
means that after indexing is completed I should see <= 10 segments. That's
my assumption, and even the documentation says that.

But after the indexing completed, I went into the Solr Dashboard and
selected the collection for which indexing had completed. It shows a
segment count of 13.
How is this possible? As I have given a mergeFactor of 10, at any point in
time there should not be more than 9 segments in the index.

I want to understand why 13 segments were created in my index.
I would appreciate a response ASAP.

Thanks
Radha






making solr to understand English

2014-06-19 Thread Vivekanand Ittigi
Hi,

I'm trying to set up Solr so that it understands English. For example, I've
indexed our company website (www.biginfolabs.com), but it could be any
other website or our own data.

If I put in some English-like queries, I should get a one-word answer,
just like Google does; example queries:

* Where is India located?
* Who is the father of Obama?

What I have tried so far:
* Integrated UIMA and Mahout with Solr
* Read the book Taming Text and implemented
https://github.com/tamingtext/book, but did not get what I want

Can anyone please tell me how to move forward? It can be anything; our
team is ready to do it.

Thanks,
Vivek


Tracing Files Which Have Errors

2014-06-19 Thread Simon Cheng
Hi there,

I have posted 190,000 simple XML files using post.jar, and only 8 files
had errors. But how do I know which ones have errors?

Thank you in advance,
Simon Cheng.


Re: making solr to understand English

2014-06-19 Thread Alexandre Rafalovitch
LoL. That's several levels of abstraction and complication above
what Solr provides.

You are looking at full Natural Language Processing and things like
SemEval (http://en.wikipedia.org/wiki/SemEval). Or at least
statistical and/or frame-based analysis
(http://en.wikipedia.org/wiki/Frame_language). Plus, it's usually domain
specific, not just point-at-the-website-and-run.

You may want to start from a PhD (yes, that would be the easy bit).

Or you could look for heavy-duty commercial systems. Again, the
keywords above would be your friends.

Regards,
   Alex.

Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Thu, Jun 19, 2014 at 4:42 PM, Vivekanand Ittigi
vi...@biginfolabs.com wrote:
 [...]


Re: Tracing Files Which Have Errors

2014-06-19 Thread Alexandre Rafalovitch
How did you post them? Didn't you get an error at some point?

If you are completely stuck, you could probably export just the IDs
back (in CSV format) and compare them to the list of what you sent. Quite
doable.
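
For example, something like this (core name and rows value illustrative,
not from the thread):

http://localhost:8983/solr/collection1/select?q=*:*&fl=id&wt=csv&rows=1000000

would dump the indexed IDs as CSV for a diff against the source files.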

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Thu, Jun 19, 2014 at 4:33 PM, Simon Cheng simonwhch...@gmail.com wrote:
 [...]


deep faceting issues in distributed mode

2014-06-19 Thread Dmitry Kan
Hello,

We face an issue with deep faceting in a distributed non-SolrCloud setting.
A query comes in through the Solr frontend (router) and is broadcast to
each shard. The exception below appears in the frontend's logs, but the
shards' logs are clean; each subquery sent by the router succeeds.

The RAM graph looks quite healthy for the router; there is plenty of RAM
free and plenty allocated to every shard and the router. So at least I'm
not worried on that side.

The issue is easily eliminated by shortening the date range parameter,
which could imply a RAM issue, but that theory is not consistent with what
we observe on the RAM graph.

Is this a known core bug, or could the issue be investigated / debugged
further?

Solr: 4.3.1
jetty 9

Error response:

<response>
  <lst name="responseHeader">
    <int name="status">500</int>
    <int name="QTime">211</int>
    <lst name="params">
      <str name="facet">true</str>
      <str name="facet.mincount">1</str>
      <str name="facet.offset">23250</str>
      <str name="q">*:*</str>
      <str name="facet.limit">750</str>
      <str name="facet.field">some_facet_field</str>
      <arr name="fq">
        <str>
          DateRangeParam:[2014-05-31T21:00:00.000Z TO 2014-06-13T21:00:00.000Z]
        </str>
        <str>SomeOtherParam:(Value1 OR *)</str>
      </arr>
      <str name="rows">0</str>
    </lst>
  </lst>
  <lst name="error">
    <str name="msg">
      java.lang.RuntimeException: Invalid version (expected 2, but 60) or the
      data in not in 'javabin' format
    </str>
    <str name="trace">
org.apache.solr.common.SolrException: java.lang.RuntimeException: Invalid
version (expected 2, but 60) or the data in not in 'javabin' format at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:302)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1820) at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:656)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:359)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1486)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:503)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:138)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:564)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:213)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1094)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:432)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:175)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1028)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:136)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:258)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:109)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:317)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:445) at
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:267) at
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:224)
at
org.eclipse.jetty.io.AbstractConnection$ReadCallback.run(AbstractConnection.java:358)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:601)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:532)
at java.lang.Thread.run(Thread.java:722) Caused by:
java.lang.RuntimeException: Invalid version (expected 2, but 60) or the
data in not in 'javabin' format at
org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:109)
at
org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41)
at
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:385)
at
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
at
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:156)
at
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:119)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at
java.util.concurrent.FutureTask.run(FutureTask.java:166) at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at
java.util.concurrent.FutureTask.run(FutureTask.java:166) at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
... 1 more
    </str>
    <int name="code">500</int>
  </lst>
</response>

-- 
Dmitry Kan
Blog: http://dmitrykan.blogspot.com
Twitter: 

Unable to start solr 4.8

2014-06-19 Thread atp
Hi experts, 

I have configured SolrCloud on three machines. ZooKeeper started with no
errors, the Tomcat log shows no errors, and the Solr log also reports no
errors, but all the Tomcat-hosted Solr nodes show as 'down' in the cluster
state.



8870931 [Thread-13] INFO  org.apache.solr.common.cloud.ZkStateReader  –
Updating cloud state from ZooKeeper...
8870934 [Thread-13] INFO  org.apache.solr.cloud.Overseer  – Update state
numShards=2 message={
  "operation":"state",
  "state":"down",
  "base_url":"http://10.***.***.28:7090/solr",
  "core":"collection1",
  "roles":null,
  "node_name":"10.***.***.28:7090_solr",
  "shard":"shard2",
  "collection":"collection1",
  "numShards":"2",
  "core_node_name":"10.***.***.28:7090_solr_collection1"}
8870939 [main-EventThread] INFO  org.apache.solr.cloud.DistributedQueue  –
LatchChildWatcher fired on path: /overseer/queue state: SyncConnected type
NodeChildrenChanged
8870942 [main-EventThread] INFO  org.apache.solr.common.cloud.ZkStateReader
– A cluster state change: WatchedEvent state:SyncConnected
type:NodeDataChanged path:/clusterstate.json, has occurred - updating...
(live nodes size: 5)
8919667 [main-EventThread] INFO  org.apache.solr.common.cloud.ZkStateReader
– Updating live nodes... (4)
8933777 [main-EventThread] INFO  org.apache.solr.common.cloud.ZkStateReader
– Updating live nodes... (3)
8965906 [main-EventThread] INFO  org.apache.solr.common.cloud.ZkStateReader
– Updating live nodes... (4)
8965994 [main-EventThread] INFO  org.apache.solr.cloud.DistributedQueue  –
LatchChildWatcher fired on path: /overseer/queue state: SyncConnected type
NodeChildrenChanged
8965997 [Thread-13] INFO  org.apache.solr.common.cloud.ZkStateReader  –
Updating cloud state from ZooKeeper...
8966000 [Thread-13] INFO  org.apache.solr.cloud.Overseer  – Update state
numShards=2 message={
  "operation":"state",
  "state":"down",
  "base_url":"http://10.***.***.29:7070/solr",
  "core":"collection1",
  "roles":null,
  "node_name":"10.***.***.29:7070_solr",
  "shard":"shard1",
  "collection":"collection1",
  "numShards":"2",
  "core_node_name":"110.***.***.29:7070_solr_collection1"}
8966006 [main-EventThread] INFO  org.apache.solr.cloud.DistributedQueue  –
LatchChildWatcher fired on path: /overseer/queue state: SyncConnected type
NodeChildrenChanged
8966008 [main-EventThread] INFO  org.apache.solr.common.cloud.ZkStateReader
– A cluster state change: WatchedEvent state:SyncConnected
type:NodeDataChanged path:/clusterstate.json, has occurred - updating...
(live nodes size: 4)
8986466 [main-EventThread] INFO  org.apache.solr.common.cloud.ZkStateReader
– Updating live nodes... (5)
8986648 [main-EventThread] INFO  org.apache.solr.cloud.DistributedQueue  –
LatchChildWatcher fired on path: /overseer/queue state: SyncConnected type
NodeChildrenChanged
8986652 [Thread-13] INFO  org.apache.solr.common.cloud.ZkStateReader  –
Updating cloud state from ZooKeeper...
8986654 [Thread-13] INFO  org.apache.solr.cloud.Overseer  – Update state
numShards=2 message={
  "operation":"state",
  "state":"down",
  "base_url":"http://10.***.***.30:7080/solr",
  "core":"collection1",
  "roles":null,
  "node_name":"10.***.***.30:7080_solr",
  "shard":"shard1",
  "collection":"collection1",
  "numShards":"2",
  "core_node_name":"10.***.***.30:7080_solr_collection1"}
8986661 [main-EventThread] INFO  org.apache.solr.cloud.DistributedQueue  –
LatchChildWatcher fired on path: /overseer/queue state: SyncConnected type
NodeChildrenChanged
898 [main-EventThread] INFO  org.apache.solr.common.cloud.ZkStateReader
– A cluster state change: WatchedEvent state:SyncConnected
type:NodeDataChanged path:/clusterstate.json, has occurred - updating...
(live nodes size: 5)
9008407 [main-EventThread] INFO  org.apache.solr.common.cloud.ZkStateReader
– Updating live nodes... (6)



When I browse the Solr URLs on the .28, .29 and .30 machines, they throw an
error like:


HTTP Status 500 - {msg=SolrCore 'collection1' is not available due to init
failure: Index locked for write for core
collection1,trace=org.apache.solr.common.SolrException: SolrCore
'collection1' is not available due to init failure: Index locked for write
for core collection1 at
org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:753) at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:347)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
at

Re: Unable to start solr 4.8

2014-06-19 Thread Markus Jelsma
Hi - remove the lock file in your solr/collection_name/data/index.*/ 
directory.

Markus

On Thursday, June 19, 2014 04:10:51 AM atp wrote:
 [...]

Re: Unable to start solr 4.8

2014-06-19 Thread atp
Thank you so much Markus,

I have removed the contents of the index directory, and now it is working,
but one of the nodes went into the 'recovering' state. The log says (please
help me bring it back to live):

– Unable to get file names for indexCommit generation: 2



12118803 [qtp1490747277-11] INFO  org.apache.solr.core.SolrCore  –
[collection1] webapp=/solr path=/replication
params={command=filelist&qt=/replication&wt=javabin&generation=2&version=2}
status=0 QTime=3
12182820 [main-EventThread] INFO  org.apache.solr.cloud.DistributedQueue  –
LatchChildWatcher fired on path: /overseer/queue state: SyncConnected type
NodeChildrenChanged
12182824 [Thread-13] INFO  org.apache.solr.common.cloud.ZkStateReader  –
Updating cloud state from ZooKeeper...
12182827 [qtp1490747277-12] INFO
org.apache.solr.handler.admin.CoreAdminHandler  – Going to wait for
coreNodeName: 10.137.12.247:7080_solr_collection1, state: recovering,
checkLive: true, onlyIfLeader: true
12182828 [Thread-13] INFO  org.apache.solr.cloud.Overseer  – Update state
numShards=2 message={
  "operation":"state",
  "state":"recovering",
  "base_url":"http://10.***.***.29:7080/solr",
  "core":"collection1",
  "roles":null,
  "node_name":"10.***.***.29:7080_solr",
  "shard":"shard1",
  "collection":"collection1",
  "numShards":"2",
  "core_node_name":"10.***.***.29:7080_solr_collection1"}
12182834 [main-EventThread] INFO  org.apache.solr.cloud.DistributedQueue  –
LatchChildWatcher fired on path: /overseer/queue state: SyncConnected type
NodeChildrenChanged
12182839 [main-EventThread] INFO  org.apache.solr.common.cloud.ZkStateReader
– A cluster state change: WatchedEvent state:SyncConnected
type:NodeDataChanged path:/clusterstate.json, has occurred - updating...
(live nodes size: 6)
12182853 [qtp1490747277-12] INFO  org.apache.solr.common.cloud.ZkStateReader
– Updating cloud state from ZooKeeper...
12182856 [qtp1490747277-12] INFO
org.apache.solr.handler.admin.CoreAdminHandler  – Will wait a max of 183
seconds to see collection1 (shard1 of collection1) have state: recovering
12182856 [qtp1490747277-12] INFO
org.apache.solr.handler.admin.CoreAdminHandler  – Waited coreNodeName:
10.137.12.247:7080_solr_collection1, state: recovering, checkLive: true,
onlyIfLeader: true for: 0 seconds.
12182858 [qtp1490747277-12] INFO  org.apache.solr.servlet.SolrDispatchFilter
– [admin] webapp=null path=/admin/cores
params={coreNodeName=10.137.12.247:7080_solr_collection1&onlyIfLeaderActive=true&state=recovering&nodeName=10.137.12.247:7080_solr&action=PREPRECOVERY&checkLive=true&core=collection1&wt=javabin&onlyIfLeader=true&version=2}
status=0 QTime=31
12184865 [qtp1490747277-19] INFO  org.apache.solr.update.UpdateHandler  –
start
commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
12184865 [qtp1490747277-19] INFO  org.apache.solr.update.UpdateHandler  – No
uncommitted changes. Skipping IW.commit.
12184867 [qtp1490747277-19] INFO  org.apache.solr.update.UpdateHandler  –
end_commit_flush
12184867 [qtp1490747277-19] INFO
org.apache.solr.update.processor.LogUpdateProcessor  – [collection1]
webapp=/solr path=/update
params={waitSearcher=true&openSearcher=false&commit=true&wt=javabin&commit_end_point=true&version=2&softCommit=false}
{commit=} 0 3
12184873 [qtp1490747277-18] INFO  org.apache.solr.core.SolrCore  –
[collection1] webapp=/solr path=/replication
params={command=indexversion&qt=/replication&wt=javabin&version=2} status=0
QTime=1
12184878 [qtp1490747277-18] ERROR org.apache.solr.handler.ReplicationHandler
– Unable to get file names for indexCommit generation: 2
java.io.FileNotFoundException: _0.fnm
        at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:260)
        at org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:177)
        at org.apache.solr.handler.ReplicationHandler.getFileList(ReplicationHandler.java:421)
        at org.apache.solr.handler.ReplicationHandler.handleRequestBody(ReplicationHandler.java:209)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)

Thanks,
ATP






Re: Unable to start solr 4.8

2014-06-19 Thread atp
Hi Markus,

It recovered automatically after several attempts.

Once again, thanks a lot for your help.

Now all the nodes are live.


[zk: Hadoop-Main:7001(CONNECTED) 0] get /clusterstate.json
{"collection1":{
    "shards":{
      "shard1":{
        "range":"80000000-ffffffff",
        "state":"active",
        "replicas":{
          "core_node1":{
            "state":"active",
            "base_url":"http://10.***.***.28:8983/solr",
            "core":"collection1",
            "node_name":"10.***.***.28:8983_solr",
            "leader":"true"},
          "core_node3":{
            "state":"active",
            "base_url":"http://10.***.***.30:8983/solr",
            "core":"collection1",
            "node_name":"10.***.***.30:8983_solr"},
          "10.***.***.28:7070_solr_collection1":{
            "state":"active",
            "base_url":"http://10.***.***.28:7070/solr",
            "core":"collection1",
            "node_name":"10.***.***.28:7070_solr"},
          "10.***.***.29:7080_solr_collection1":{
            "state":"active",
            "base_url":"http://10.***.***.29:7080/solr",
            "core":"collection1",
            "node_name":"10.***.***.29:7080_solr"}}},
      "shard2":{
        "range":"0-7fffffff",
        "state":"active",
        "replicas":{
          "core_node2":{
            "state":"active",
            "base_url":"http://10.***.***.29:8983/solr",
            "core":"collection1",
            "node_name":"10.***.***.29:8983_solr",
            "leader":"true"},
          "10.***.***.30:7090_solr_collection1":{
            "state":"active",
            "base_url":"http://10.***.***.30:7090/solr",
            "core":"collection1",
            "node_name":"10.***.***.30:7090_solr"}}}},
    "maxShardsPerNode":"1",
    "router":{"name":"compositeId"},
    "replicationFactor":"1",
    "autoCreated":"true"}}



Regards,
ATP.





Re: Segment Count of my Index is greater than the Configured MergeFactor

2014-06-19 Thread Shawn Heisey
On 6/19/2014 2:51 AM, RadhaJayalakshmi wrote:
 [...]

Imagine the following scenario.  You start from a clean index and do
enough indexing to create ten little segments.  At that point, Solr will
merge these segments into one large segment.  Let's say that now you do
enough indexing to create ten more segments.  It won't do the merge when
you reach nine little segments and one large segment ... it will do the
merge when you have ten little segments.  When the merge is done, you'll
be left with two large segments.  If you do enough indexing now to
create twenty new segments, then at the end you'll be left with four
large segments.  After this, if you index nine new segments, you've got
thirteen segments in your index and it won't do any more merging until
another segment is created.
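
Here is a toy Java simulation of the counting described above
(illustrative only, not from the thread; it assumes a pure log-merge
policy where exactly ten same-level segments merge into one segment at the
next level):

import java.util.ArrayList;
import java.util.List;

public class MergeSim {
    public static void main(String[] args) {
        List<Integer> levels = new ArrayList<>();
        for (int flush = 0; flush < 49; flush++) {
            add(levels, 0);                    // one new small segment
        }
        // Prints [9, 4]: 9 small + 4 merged = the 13 segments from the
        // example above.
        System.out.println("Segments per level: " + levels);
    }

    static void add(List<Integer> levels, int level) {
        while (levels.size() <= level) levels.add(0);
        levels.set(level, levels.get(level) + 1);
        if (levels.get(level) == 10) {         // mergeFactor reached:
            levels.set(level, 0);              // 10 segments collapse...
            add(levels, level + 1);            // ...into 1 at the next level
        }
    }
}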

Additional merge levels exist.  When you reach ten large segments, Solr
will merge those into one huge segment.  If indexing continues long
enough to create ten huge segments, they will be merged into one
enormous segment.  It would be possible to have a stable index with 9
segments at each of the levels that I have mentioned -- 36 segments.
The merge policy that Solr uses by default will continue creating
additional merge levels until the segments at the highest reach at least
five gigabytes in size -- nothing larger will be created unless you
optimize the index.

My effective merge factor is 35.  I have personally witnessed stable
indexes on my system with 80 or 90 segments.

Thanks,
Shawn



Re: Calculating filterCache size

2014-06-19 Thread Erick Erickson
Ben:

As Shawn says, you're on the right track...

Do note, though, that a 10K size here is probably excessive, YMMV of course.

And an autowarm count of 5,000 is almost _certainly_ far more than you
want. All these fq clauses get re-executed whenever a new searcher is
opened (soft commit or hard commit with openSearcher=true). I realize
this may just be illustrative. Is this your actual setup? And if so,
what is your motivation for 5,000 autowarm count?

Best,
Erick

On Wed, Jun 18, 2014 at 11:42 AM, Shawn Heisey s...@elyograg.org wrote:
 On 6/18/2014 10:57 AM, Benjamin Wiens wrote:
 Thanks Erick!
 So let's say I have a config of

  <filterCache
    class="solr.FastLRUCache"
    size="10000"
    initialSize="10000"
    autowarmCount="5000"/>

 MaxDocuments = 1,000,000

 So according to your formula, filterCache should roughly have the potential
 to consume this much RAM:
 ((1,000,000 / 8) + 128) * 10,000 = 1,251,280,000 bytes / 1,000 =
 1,251,280 KB / 1,000 = 1,251.28 MB / 1,000 = 1.25 GB

 Yes, this is essentially correct.  If you want to arrive at a number
 that's more accurate for the way that OS tools will report memory,
 you'll divide by 1024 instead of 1000 for each of the larger units.
 That results in a size of 1.16GB instead of 1.25.  Computers think in
 powers of 2, dividing by 1000 assumes a bias to how people think, in
 powers of 10.  It's the same thing that causes your computer to report
 931GB for a 1TB hard drive.

 Thanks,
 Shawn
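
A quick Java check of the same arithmetic (a sketch; the formula -- one
bit per document plus roughly 128 bytes of per-entry overhead -- is taken
from the thread above):

public class FilterCacheSize {
    public static void main(String[] args) {
        long maxDocs = 1_000_000;   // documents in the index
        long entries = 10_000;      // filterCache size
        long bytes = (maxDocs / 8 + 128) * entries;
        // Prints ~1.165 GiB, matching Shawn's 1.16GB figure.
        System.out.printf("%d bytes = %.3f GiB%n",
                bytes, bytes / (1024.0 * 1024 * 1024));
    }
}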



Re: Synonyms - 20th and 20

2014-06-19 Thread Erick Erickson
You almost certainly have WordDelimiterFilterFactory in your analysis
chain after the synonym insertion. Its _job_ is to split on
letter/non-letter transitions.

The admin/analysis page is your friend.

Best,
Erick

On Wed, Jun 18, 2014 at 12:47 PM, Diego Fernandez difer...@redhat.com wrote:
 What tokenizer and filters are you using?

 Diego Fernandez - 爱国
 Software Engineer
 US GSS Supportability - Diagnostics


 - Original Message -
 I have a synonyms.txt file which has
 20th,twentieth

 Once I apply the synonym, I see 20th, twentieth and 20 for 20th.
 Does anyone know where 20 comes from? How can I have only 20th and
 twentieth?

 Thanks,

 Jae



Re: Bug in Collapsing QParserPlugin : Sort by 3 or more fields is broken

2014-06-19 Thread Joel Bernstein
Umesh, this is a good summary.

So, the question is: what is the cost (performance and memory) of having the
CollapsingQParserPlugin choose the group head by using the Solr sort
criteria?

Keep in mind that the CollapsingQParserPlugin's main design goal is to
provide fast performance when collapsing on a high-cardinality field. How
you choose the group head can have a big impact here, on both memory
consumption and performance.

The function query collapse criterion was added to allow you to come up with
custom formulas for selecting the group head, with little or no impact on
performance and memory. Using Solr's recip() function query, it seems like
you could come up with some nice scenarios where two variables are used to
select the group head. For example:

fq={!collapse field=a max='sub(prod(cscore(),1000), recip(field(x),1, 1000,
1000))'}

This would basically give you two sort criteria: cscore(),
which returns the score, would be the primary criterion. The recip of field
x would be the secondary criterion.

Joel Bernstein
Search Engineer at Heliosearch


On Thu, Jun 19, 2014 at 2:18 AM, Umesh Prasad umesh.i...@gmail.com wrote:
 [...]



Query Response in Html

2014-06-19 Thread Venkata krishna
Hi,

I am using the XSLTResponseWriter in my application to transform the XML
response into HTML. I have set the following params for that purpose:

query.set("wt", "xslt");
query.set("indent", true);
query.set("tr", "example.xsl");

but the response comes back as normal text. Even when I remove the params,
the response comes back the same as before, without any change.

I have also tried the VelocityResponseWriter by setting the following
params:

query.set("wt", "velocity");
query.set("v.template", "browse");
query.set("v.layout", "layout");

and I still get the same response as normal text.

I would like to get an HTML response.

Could you please suggest a solution?


Thanks,

Venkata Krishna Tolusuri.
 













Re: Calculating filterCache size

2014-06-19 Thread Benjamin Wiens
Thanks to both of you. Yes, the mentioned config is illustrative; we decided
on 512 after thorough testing. However, when you google "Solr filterCache",
the first link is the community wiki, which has a config even larger than
the illustration and quite different from the official reference guide. It
might be a good idea to change this, unless it's meant for a very small
index.

http://wiki.apache.org/solr/SolrCaching#filterCache

<filterCache class="solr.LRUCache" size="16384"
             initialSize="4096" autowarmCount="4096"/>






On Thu, Jun 19, 2014 at 9:48 AM, Erick Erickson erickerick...@gmail.com
wrote:
 [...]


Re: Limit Porter stemmer to plural stemming only?

2014-06-19 Thread jerry.ja...@gmail.com
Hi,

Do you mind attaching the plural-only stemmer? I can't find it in this post.

Thanks
Jerry





Re: Why aren't my nested documents nesting?

2014-06-19 Thread Vinay B,
Thanks,
I tried the block join query via the browser this morning with no success.
My URL (encoded, of course) is below. I used this as a guide:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers


http://localhost:8088/solr/test_core/select?q={!parent
which="content_type:parentDocument"}ATTRIBUTES.STATE:TX&wt=json&indent=true

(
equivalent to
http://localhost:8088/solr/test_core/select?q=%7b!parent+which%3d%22content_type%3aparentDocument%22%7dATTRIBUTES.STATE%3aTX%26wt%3djson%26indent%3dtrue
)

Resulting in
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
    <lst name="params">
      <str name="q">
        {!parent
        which="content_type:parentDocument"}ATTRIBUTES.STATE:TX&wt=json&indent=true
      </str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0"/>
</response>



On Wed, Jun 18, 2014 at 11:30 PM, Mikhail Khludnev 
mkhlud...@griddynamics.com wrote:

 Because you need to query with the special query parser:
 http://blog.griddynamics.com/2013/09/solr-block-join-support.html
 To nest the output you need
 https://issues.apache.org/jira/browse/SOLR-5285


 On Thu, Jun 19, 2014 at 3:20 AM, Vinay B, vybe3...@gmail.com wrote:

  Probably a silly error. Can someone point out my mistake? Code and output
  gists at https://gist.github.com/anonymous/fb9cdb5b44e76b2c308d
 
  Thanks
 
  Code:
  SolrInputDocument solrDoc = new SolrInputDocument();
  solrDoc.addField("id", documentId);
  solrDoc.addField("content_type", "parentDocument");
  solrDoc.addField(Constants.REMOTE_FILE_PATH, filePath == null ? "" : filePath);
  solrDoc.addField(Constants.REMOTE_FILE_LOAD, Constants.TRUE);

  SolrInputDocument childDoc = new SolrInputDocument();
  childDoc.addField(Constants.ID, documentId + "-A");
  childDoc.addField("ATTRIBUTES.STATE", "LA");
  childDoc.addField("ATTRIBUTES.STATE", "TX");
  solrDoc.addChildDocument(childDoc);

  solrServer.add(solrDoc);
  solrServer.commit();
 



 --
 Sincerely yours
 Mikhail Khludnev
 Principal Engineer,
 Grid Dynamics

 http://www.griddynamics.com
  mkhlud...@griddynamics.com



clarification on index-to-ram ratio

2014-06-19 Thread Vinay Pothnis
Hello All,

The documentation and general feedback on the mailing list suggest the
following:

*... Let's say that you have a Solr index size of 8GB. If your OS, Solr's
Java heap, and all other running programs require 4GB of memory, then
an ideal memory size for that server is at least 12GB ...*

http://wiki.apache.org/solr/SolrPerformanceProblems#General_information

So, when we say "index size", does it include ALL the replicas or just one
of the replicas? For example, if the Solr instance had 2 replicas, each
of size 8GB, should we consider 16GB as our index size or just 8GB for
the above index-to-RAM-ratio consideration?

Thanks
Vinay


Re: Segment Count of my Index is greater than the Configured MergeFactor

2014-06-19 Thread Chris Hostetter

:  I want to understand why 13 segments are created in my index??
:  Could appreciate if i can get response ASAP

: Imagine the following scenario.  You start from a clean index and do

FWIW: the TL;DR of Shawn's response can be seen in this animation of how
log-based MergePolicies work in the simplest scenarios...

https://www.youtube.com/watch?v=YW0bOvLp72E

More animations of other scenarios and other MergePolicies can be found
here...

http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html


-Hoss
http://www.lucidworks.com/


RE: clarification on index-to-ram ratio

2014-06-19 Thread Toke Eskildsen
Vinay Pothnis [poth...@gmail.com] wrote:
 *... Let's say that you have a Solr index size of 8GB. If your OS, Solr's
 Java heap, and all other running programs require 4GB of memory, then
 an ideal memory size for that server is at least 12GB ...*

 So, when we say index size does it include ALL the replicas or just one
 of the replica? Say for example, if the solr instance had 2 replicas each
 of size 8GB, should we consider 16GB as our index size or just 8GB - for
 the above index-ram-ratio consideration?

16GB, according to the above principle. Enough RAM to hold all index data on 
storage.

Two things though,

1) If you have replicas of the same data on the same machine, I hope that you 
have them on separate physical drives. If not, it is just wasted disk cache 
with no benefits.

2) The general advice is only really usable when we're either talking fairly
small indexes on spinning drives or there is a strong need for the absolute
lowest latency possible. As soon as we scale up and do not have copious
amounts of money, solid state drives provide much better bang for the buck
than a spinning-drives + RAM combination.

- Toke Eskildsen


Re: clarification on index-to-ram ratio

2014-06-19 Thread Vinay Pothnis
Thanks!
And yes, the replica belongs to a different shard - not the same data.

-Vinay


On 19 June 2014 11:21, Toke Eskildsen t...@statsbiblioteket.dk wrote:
 [...]



Re: Query Response in Html

2014-06-19 Thread Erik Hatcher
Show us the complete code you’re using. Is the “text” not HTML text? What are 
you receiving exactly, and what are you expecting instead?

To use SolrJ with other types of responses (non-XML/javabin) you’ll need to 
configure a ResponseParser.  The NoOpResponseParser may do the trick, where you 
get back the text (though it would be HTML text) in the “response” key of the 
NamedList returned.
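
For illustration, a minimal SolrJ sketch along those lines (Solr 4.x; hypothetical 
core URL, stylesheet name taken from the question below):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.impl.NoOpResponseParser;
    import org.apache.solr.client.solrj.request.QueryRequest;
    import org.apache.solr.common.util.NamedList;

    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
    SolrQuery query = new SolrQuery("*:*");
    query.set("wt", "xslt");        // ask for the XSLT response writer
    query.set("tr", "example.xsl"); // stylesheet under conf/xslt/
    QueryRequest req = new QueryRequest(query);
    NoOpResponseParser rawParser = new NoOpResponseParser();
    rawParser.setWriterType("xslt"); // return the raw response instead of parsing it
    req.setResponseParser(rawParser);
    NamedList<Object> result = server.request(req);
    String html = (String) result.get("response"); // the transformed output as a String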

Erik

On Jun 19, 2014, at 10:27 AM, Venkata krishna venkat1...@gmail.com wrote:

 Hi,
 
 I am using the XSLTResponseWriter in my application to transform the XML
 response into HTML. I have set the following params for that purpose:
 
 query.set("wt", "xslt");
 query.set("indent", "true");
 query.set("tr", "example.xsl");
 
 but the response comes back as plain text. Even if I remove the params,
 the response is exactly the same as before.
 
 I have also tried the VelocityResponseWriter, setting the following
 params:
 query.set("wt", "velocity");
 query.set("v.template", "browse");
 query.set("v.layout", "layout");
 
 I still get the same plain-text response.
 
 I would like to get an HTML response.
 
 Could you please suggest a solution?
 
 
 Thanks,
 
 Venkata Krishna Tolusuri.
 
 
 
 
 
 
 
 
 
 
 
 



Re: Calculating filterCache size

2014-06-19 Thread Erick Erickson
That's specific to using facet.method=enum, but I admit it's easy
to miss that.

I added a note about that though...

Thanks for pointing that out!


On Thu, Jun 19, 2014 at 9:38 AM, Benjamin Wiens
benjamin.wi...@gmail.com wrote:
 Thanks to both of you. Yes, the mentioned config is illustrative; we decided
 on 512 after thorough testing. However, when you google "Solr filterCache",
 the first link is the community wiki, which has a config even higher than
 the illustration and quite different from the official reference guide. It
 might be a good idea to change this unless it's meant for a very small
 index.

 http://wiki.apache.org/solr/SolrCaching#filterCache

 <filterCache class="solr.LRUCache" size="16384"
 initialSize="4096" autowarmCount="4096"/>






 On Thu, Jun 19, 2014 at 9:48 AM, Erick Erickson erickerick...@gmail.com
 wrote:

 Ben:

 As Shawn says, you're on the right track...

 Do note, though, that a 10K size here is probably excessive, YMMV of
 course.

 And an autowarm count of 5,000 is almost _certainly_ far more than you
 want. All these fq clauses get re-executed whenever a new searcher is
 opened (soft commit or hard commit with openSearcher=true). I realize
 this may just be illustrative. Is this your actual setup? And if so,
 what is your motivation for 5,000 autowarm count?

 Best,
 Erick

 On Wed, Jun 18, 2014 at 11:42 AM, Shawn Heisey s...@elyograg.org wrote:
  On 6/18/2014 10:57 AM, Benjamin Wiens wrote:
  Thanks Erick!
  So let's say I have a config of
 
   <filterCache
   class="solr.FastLRUCache"
   size="10000"
   initialSize="10000"
   autowarmCount="5000"/>
 
  MaxDocuments = 1,000,000
 
  So according to your formula, filterCache should roughly have the
 potential
  to consume this much RAM:
   ((1,000,000 / 8) + 128) * 10,000 = 1,251,280,000 bytes / 1,000 =
   1,251,280 KB / 1,000 = 1,251.28 MB / 1,000 = 1.25 GB
 
  Yes, this is essentially correct.  If you want to arrive at a number
  that's more accurate for the way that OS tools will report memory,
  you'll divide by 1024 instead of 1000 for each of the larger units.
  That results in a size of 1.16GB instead of 1.25.  Computers think in
  powers of 2, dividing by 1000 assumes a bias to how people think, in
  powers of 10.  It's the same thing that causes your computer to report
  931GB for a 1TB hard drive.
 
  Thanks,
  Shawn
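  (Worked out with 1024: 1,251,280,000 / 1024 / 1024 / 1024 ≈ 1.165, hence the
  ~1.16GB figure above.)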
 



[ANN] Heliosearch 0.06 released, native code faceting

2014-06-19 Thread Yonik Seeley
FYI, for those who want to try out the new native code faceting, this
is the first release containing it (for single valued string fields
only as of yet).

http://heliosearch.org/download/

Heliosearch v0.06

Features:
o  Heliosearch v0.06 is based on (and contains all features of)
Lucene/Solr 4.9.0
o  Native code faceting for single valued string fields.
- Written in C++, statically compiled with gcc for Windows, Mac OS-X, Linux
- static compilation avoids JVM hotspot warmup period,
mis-compilation bugs, and variations between runs
- Improves performance over 2x
o  Top level Off-heap fieldcache for single valued string fields in nCache.
- Improves sorting and faceting speed
- Reduces garbage collection overhead
- Eliminates FieldCache “insanity” that exists in Apache Solr from
faceting and sorting on the same field
o  Full request parameter substitution / macro expansion, including
default value support (see the example after this list).
o  frange query now only returns documents with a value.
 For example, in Apache Solr, {!frange l=-1 u=1 v=myfield} will
also return documents without a value since the numeric default value
of 0 lies within the range requested.
o  New JSON features via Noggit upgrade, allowing optional comments
(C/C++ and shell style), unquoted keys, and relaxed escaping that
allows one to backslash escape any character.
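
As a quick illustration of the parameter substitution (using the ${param} /
${param:default} syntax described on heliosearch.org):

    /select?q=price:[${low} TO ${high:100}]&low=10

expands to q=price:[10 TO 100] -- the unset "high" parameter falls back to
its declared default of 100.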


-Yonik
http://heliosearch.org - native code faceting, facet functions,
sub-facets, off-heap data


Re: Cursor deep paging new behavior

2014-06-19 Thread Chris Hostetter

if by "old behavior" you mean incrementing the start param, then the 
order of results when doing concurrent indexing was always dependent on 
what exactly your sort was.

when using a cursor, the impacts of concurrent indexing are also dependent 
on what your sort clause looks like -- but in different ways.

both situations are extensively documented...

https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results#PaginationofResults-HowBasicPaginationisAffectedbyIndexUpdates
https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results#PaginationofResults-HowcursorsareAffectedbyIndexUpdates
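
For reference, a minimal SolrJ sketch of cursor paging (Solr 4.7+; assumes an 
HttpSolrServer named "server" and a uniqueKey field named "id"):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;
    import org.apache.solr.common.params.CursorMarkParams;

    SolrQuery q = new SolrQuery("*:*");
    q.setRows(50);
    q.setSort(SolrQuery.SortClause.asc("id")); // sort must end on the uniqueKey
    String cursorMark = CursorMarkParams.CURSOR_MARK_START; // "*"
    while (true) {
      q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
      QueryResponse rsp = server.query(q);
      for (SolrDocument doc : rsp.getResults()) {
        // process each document here
      }
      String next = rsp.getNextCursorMark();
      if (cursorMark.equals(next)) break; // unchanged cursor means no more results
      cursorMark = next;
    }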


: I have a quick question about this new implementation - in the old 
: implementation AFAIK, in a real-time indexing scenario, the results 
: gathered from paging would not be consecutive. Meaning you would ask for 
: 50 docs, new docs arrive, when you ask for the next 50 docs - you get an 
: arbitrary new document set (50 after any newly inserted docs).
: 
: Having read a bit about the cursor implementation, is it true that the 
: next 50 results are now consecutive to the first set due to the fact 
: that lucene actually tracks the mark?


-Hoss
http://www.lucidworks.com/


Re: Limit Porter stemmer to plural stemming only?

2014-06-19 Thread Chris Hostetter

: Can you please share the Java code for Plural Only Porter Stemmer for English 
if you don't mind?

The Porter stemmer algorithm, by definition, does more than just strip 
plurals.

If you are interested in a lighter weight stemmer for english, this is 
exactly what the EnglishMinimalStemFilterFactory is for...

https://lucene.apache.org/core/4_8_0/analyzers-common/org/apache/lucene/analysis/en/EnglishMinimalStemFilterFactory.html

Although you may also be interested in combining it with the 
EnglishPossessiveFilterFactory...

https://lucene.apache.org/core/4_8_0/analyzers-common/org/apache/lucene/analysis/en/EnglishPossessiveFilterFactory.html
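
For illustration, a sketch of wiring the two together in plain Lucene 4.x (the 
schema.xml equivalent chains the corresponding factories in a fieldType):

    import java.io.Reader;
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.Tokenizer;
    import org.apache.lucene.analysis.core.LowerCaseFilter;
    import org.apache.lucene.analysis.en.EnglishMinimalStemFilter;
    import org.apache.lucene.analysis.en.EnglishPossessiveFilter;
    import org.apache.lucene.analysis.standard.StandardTokenizer;
    import org.apache.lucene.util.Version;

    Analyzer pluralOnly = new Analyzer() {
      @Override
      protected TokenStreamComponents createComponents(String field, Reader reader) {
        Tokenizer tok = new StandardTokenizer(Version.LUCENE_48, reader);
        TokenStream ts = new EnglishPossessiveFilter(Version.LUCENE_48, tok); // strips trailing 's
        ts = new LowerCaseFilter(Version.LUCENE_48, ts);
        ts = new EnglishMinimalStemFilter(ts); // strips simple plurals only
        return new TokenStreamComponents(tok, ts);
      }
    };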



-Hoss
http://www.lucidworks.com/


Re: Multivalue wild card search

2014-06-19 Thread Ethan
Ahmet,

 Assuming there is a multiValued field called "Name" of type "string" stored
in the index -

//Doc 1
id : 23512
HotelId : [
12,
23,
12
]
Name : [
[[\"Ethan\", \"G\", \"\"],[\"Steve\", \"Wonder\", \"\"]],
[],
[[\"hifte\", \"Grop\", \"\"]]
]

// Doc 2

id : 23513
HotelId : [
12,
12
]
Name : [
[[\"Ethan\", \"G\", \"\"],[\"Steve\", \"\", \"\"]],
[],
]

Here, how do I find the document whose "Name" contains "Steve Wonder"?

I tried q=***[\"Steve\", \"Wonder\", \"\"]] but that doesn't work.

On Fri, Jun 6, 2014 at 11:10 AM, Ahmet Arslan iori...@yahoo.com.invalid
wrote:

 Hi Ethan,


 It is hard to understand your example. Can you re-write it? Using xml?



 On Friday, June 6, 2014 9:07 PM, Ethan eh198...@gmail.com wrote:
 Bumping the thread to see if anyone has a solution.





 On Thu, Jun 5, 2014 at 9:52 AM, Ethan eh198...@gmail.com wrote:

  Wildcard search do work on multiValued field.  I was able to pull up
  records for following multiValued field -
 
  Code : [
  12344,
  4534,
  674
  ]
 
  q=Code:45* fetched the correct document.  It doesn't work in
  quotes (q=Code:"45*"), however.  Is there a workaround?
 
 
  On Thu, Jun 5, 2014 at 9:34 AM, Ethan eh198...@gmail.com wrote:
 
  Are you implying there is not way to lookup on a multiValued field with
 a
  substring?  If so, then how is it usually handled?
 
 
  On Wed, Jun 4, 2014 at 4:44 PM, Jack Krupansky j...@basetechnology.com
 
  wrote:
 
  Wildcard, fuzzy, and regex query operate on a single term of a single
  tokenized field value or a single string field value.
 
  -- Jack Krupansky
 
  -Original Message- From: Ethan
  Sent: Wednesday, June 4, 2014 6:59 PM
  To: solr-user
  Subject: Multivalue wild card search
 
 
  I can't seem to find a solution to do wild card search on a multiValued
  field.
 
  For Eg consider a multiValued field called Name with 3 values -
 
  Name : [
  [[\"Ethan\", \"G\", \"\"],[\"Steve\", \"Wonder\", \"\"]],
  [],
  [[\"hifte\", \"Grop\", \"\"]]
  ]
 
  For a multiValued like above, I want search like-
 
  q=***[\"Steve\", \"Wonder\", \"\"]
 
 
  But I do not get back any results back. Any ideas on to create such
  query?
 
 
 
 




Indexing a term into separate Lucene indexes

2014-06-19 Thread Huang, Roger
If I have documents with a person and his email address: 
u...@domain.com

How can I configure Solr (4.6) so that the email address source field is 
indexed as

-  the user part of the address (e.g., user) is in Lucene index X

-  the domain part of the address (e.g., domain.com) is in a separate 
Lucene index Y

I would like to be able search as follows:

-  Find all people whose email addresses have user part = userXyz

-  Find all people whose email addresses have domain part = 
domainABC.com

-  Find the person with exact email address = user...@domainabc.com

Would I use a copyField declaration in my schema?
http://wiki.apache.org/solr/SchemaXml#Copy_Fields
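
(For illustration: copyField copies the raw source value to another field, so 
splitting into parts would pair it with a pattern tokenizer in schema.xml -- or 
you can split on the client side before indexing. Note these would be separate 
fields within one index rather than separate Lucene indexes. A minimal SolrJ 
sketch, with hypothetical field names email_user / email_domain:)

    import org.apache.solr.common.SolrInputDocument;

    SolrInputDocument doc = new SolrInputDocument();
    String email = "userXyz@domainABC.com";
    int at = email.indexOf('@');
    doc.addField("email", email);                           // for exact-match lookups
    doc.addField("email_user", email.substring(0, at));     // "userXyz"
    doc.addField("email_domain", email.substring(at + 1));  // "domainABC.com"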

Thanks!


Fwd: Tracing Files Which Have Errors

2014-06-19 Thread Simon Cheng
Hi there,

I have posted 190,000 simple XML files using post.jar, and only 8 files
had errors. But how do I know which ones have errors?
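
One possible approach (a sketch, not something post.jar does for you): send each 
file individually with SolrJ and log the ones that fail, e.g.:

    import java.io.File;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
    for (File f : new File("docs").listFiles()) {  // hypothetical directory
      try {
        ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update");
        req.addFile(f, "application/xml");
        server.request(req);
      } catch (Exception e) {
        System.err.println("FAILED: " + f.getName() + " - " + e.getMessage());
      }
    }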

Thank you in advance,
Simon Cheng.


running Post jar from different server

2014-06-19 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hi,  I have a situation where my SQL job initiates a console application, from 
which I am calling post.jar to upload data to Solr. The SQL DB and Solr are on 
two different servers.

I am calling post.jar from my SQL DB server, where the path is mapped to a 
network drive, and I am getting a "file not found" error.

Is the above scenario possible? If anyone has experience with this, can you 
share it? Any direction will be really appreciated.

Thanks

Ravi


Re: running Post jar from different server

2014-06-19 Thread Sameer Maggon
Ravi,

post.jar is a standalone utility that does not have to be on the same
server. If you can share the command you are executing, there might be some
pointers in there.
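
For reference, a typical invocation against a remote Solr looks like this 
(hypothetical host and file name; -Durl tells post.jar where to send the docs):

    java -Durl=http://solrhost:8983/solr/update -jar post.jar docs.xml

If the mapped network drive or UNC path is not visible to the account the SQL 
job runs under, a file-not-found error is exactly what you would see.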

Thanks,
-- 
*Sameer Maggon*
http://measuredsearch.com





[ANN][Meta] Apache Solr popularizers LinkedIn group

2014-06-19 Thread Alexandre Rafalovitch
Hello,

( TL;DR: http://www.linkedin.com/groups?gid=6713853 )

Based on - short :-] - Twitter discussion,  we have decided to have a
go at some sort of a group for Solr popularizers: people who are
teaching Solr, running meetups, building Solr examples, and writing
Solr books.

Basically, anybody whose goal is not just to learn Solr for themselves
but to also spread the sunny goodness message to others.

The current attempt at this conversation space is a private LinkedIn
group. Which obviously has some advantages and disadvantages. I
thought about Google and Yahoo groups, but feel they have even more
disadvantages. The group is private (for now) to see if it will foster
more frank discussion and maybe things like early slide sharing and
work-in-progress.

So, if you are popularizing Solr and feel you could benefit from a
community of other people struggling with the same meta-level
explanation issues, this is for you. Come and help us build that
community: http://www.linkedin.com/groups?gid=6713853

Regards,
   Alex.

Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


Re: Multivalue wild card search

2014-06-19 Thread Jack Krupansky

1. Wildcards do not work within quoted terms.
2. Spaces in terms need to be escaped.
3. The quotes embedded in a term do not need to be escaped.

So, try:

q=*["Steve",\ "Wonder",\ ""]]

or

q=*["Steve",\ "Wonder",\ ""]*

-- Jack Krupansky


Re: [ANN] Heliosearch 0.06 released, native code faceting

2014-06-19 Thread Andy
Congrats! Any idea when native faceting & off-heap fieldcache will be available 
for multivalued fields? Most of my fields are multivalued, so that's the big one 
for me.

Andy



Re: Segment Count of my Index is greater than the Configured MergeFactor

2014-06-19 Thread RadhaJayalakshmi
Thanks Shawn and thanks Chris!!
Shawn, your explanation was very clear and clarified my doubts.

Chris,
The video was also very useful.






Solr index pdf/word document with attachements

2014-06-19 Thread Prasi S
Hi,
How can I index Word / PDF documents with attachments into Solr?

I have tried indexing a simple file with an attachment using Tika, but it
does not index the attachment separately. Only the original document is
getting indexed.
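
For reference, the basic SolrCell call looks like the sketch below (SolrJ 4.x; 
file name and id are hypothetical). Note that by default Tika/SolrCell flattens 
the text of embedded attachments into the parent document's extracted content, 
which matches what you are seeing; indexing attachments as separate documents 
generally means extracting them client-side with Tika and sending each one as 
its own document.

    import java.io.File;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
    ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
    req.addFile(new File("letter-with-attachment.pdf"), "application/pdf"); // hypothetical file
    req.setParam("literal.id", "doc1"); // hypothetical unique id
    req.setParam("commit", "true");
    server.request(req);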



Thanks,
Prasi