Re: Retrieving multivalued field elements

2014-08-25 Thread Vivekanand Ittigi
Yes, you are right. It worked!

-Vivek


On Mon, Aug 25, 2014 at 7:39 PM, Ahmet Arslan 
wrote:

> Hi Vivek,
>
> how about this?
>
> Iterator<SolrDocument> iter = queryResponse.getResults().iterator();
>
> while (iter.hasNext()) {
>   SolrDocument resultDoc = iter.next();
>
>   Collection<Object> content =
>  resultDoc.getFieldValues("discussions");
> }
>
>
>
> On Monday, August 25, 2014 4:55 PM, Vivekanand Ittigi <
> vi...@biginfolabs.com> wrote:
> Hi,
>
> I've a multivalued field and I want to display all array elements using a
> SolrJ command.
>
> I used the command mentioned below, but I'm able to retrieve only the 1st
> element of the array.
>
> response.getResults().get(0).getFieldValueMap().get("discussions")
> Output: Creation Time - 2014-06-12 17:37:53.0
>
> NOTE: "discussions" is multivalued field in solr which contains
>
> 
>   Creation Time - 2014-06-12 17:37:53.0
>   Last modified Time - 2014-06-12 17:42:09.0
>   Comment - posting bug from risk flows ...posting comment from
> risk flows ...syncing comments ...
> 
>
> Is there any SolrJ API for retrieving multivalued elements, or is it not
> possible?
>
> -Vivek
>
>


how to combine solr join with boost in Edismax query?

2014-08-25 Thread jiag
Hello everyone :)

I have an index for groupId and one for product. For an input search
keyword, I only want to boost the result if the keyword appears in both
groupId and product indices.
I was able to get Solr join with fq to work with the following syntax:
example: q=searchTerm&fq={!join from=id_1 to=id_2
fromIndex=groupId}searchTerm

But I want to use the Solr join with bf or bq; does anyone have suggestions
on how to make it work?
(I also use qf, pf, and ps)

I tried the following but failed:
q=searchTerm&bf=({!join from=id_1 to=id_2
fromIndex=groupId}searchTerm)^100

q=searchTerm&bq=({!join from=id_1 to=id_2
fromIndex=groupId}searchTerm)^100
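
I am wondering whether something like the following would work (an untested
sketch; jq is a made-up request parameter whose value is passed to the join
parser through the v local parameter, since the {!...} block must start the
bq value):

q=searchTerm&bq={!join from=id_1 to=id_2 fromIndex=groupId v=$jq}^100&jq=searchTerm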

Many thanks
jia


Re: Copying a collection from one version of SOLR to another

2014-08-25 Thread Michael Della Bitta
Hi Philippe,

You can indeed copy an index like that. The problem probably arises because
4.9.0 is using core discovery by default. This wiki page will shed some
light:

https://wiki.apache.org/solr/Core%20Discovery%20%284.4%20and%20beyond%29
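
In short: with core discovery, Solr only picks up directories that contain a
core.properties file. A minimal sketch, assuming the copied core lives at
solr4.9.0/example/solr/collection3:

# solr4.9.0/example/solr/collection3/core.properties
name=collection3

With that file in place, a restart should make the core show up in the admin
UI.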

Michael Della Bitta
Applications Developer
o: +1 646 532 3062
appinions inc.
“The Science of Influence Marketing”
18 East 41st Street
New York, NY 10017
t: @appinions | g+: plus.google.com/appinions
w: appinions.com


On Mon, Aug 25, 2014 at 4:31 AM,  wrote:

>
> Hello,
>
> is it possible to copy a collection created with SOLR 4.6.0 to a SOLR
> 4.9.0 server?
>
> I have just copied a collection called 'collection3', located in
> solr4.6.0/example/solr,  to solr4.9.0/example/solr, but to no avail,
> because my SOLR 4.9.0 Server's admin does not list it among the available
> cores.
>
> What am I doing wrong?
>
> Many thanks.
>
> Philippe
>
>


Re: Questions about caching and HDFSDirectory

2014-08-25 Thread Michael Della Bitta
Just in case someone else runs into this post, I think the following two
URLs have me sorted:

http://techkites.blogspot.com/2014/06/performance-tuning-and-optimization-for.html

http://www.cloudera.com/content/cloudera-content/cloudera-docs/Search/1.1.0-beta2/Cloudera-Search-User-Guide/csug_tuning_solr.html

If anyone has anything to add or correct about these two resources, please
let me know!
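
For the archives, the kind of directoryFactory configuration those pages
describe looks roughly like this (a sketch only; the parameter names come
from those pages, the values are placeholders that need tuning):

<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs://namenode:8020/solr</str>
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
  <bool name="solr.hdfs.blockcache.global">true</bool>
  <int name="solr.hdfs.blockcache.slab.count">1</int>
  <bool name="solr.hdfs.blockcache.write.enabled">false</bool>
</directoryFactory>

plus -XX:MaxDirectMemorySize on the JVM command line so the slabs have direct
memory to allocate from.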


Michael Della Bitta
Applications Developer
o: +1 646 532 3062
appinions inc.
“The Science of Influence Marketing”
18 East 41st Street
New York, NY 10017
t: @appinions | g+: plus.google.com/appinions
w: appinions.com


On Fri, Aug 22, 2014 at 3:54 PM, Michael Della Bitta <
michael.della.bi...@appinions.com> wrote:

> I'm looking at the Solr Reference Guide about Solr on HDFS, and it's
> bringing up a couple of quick questions for me. I guess I got spoiled by
> MMapDirectory and how magically it worked!
>
> 1. What is the minimum number of configuration parameters that enables
> HDFS block caching? It seems like I need to set -XX:MaxDirectMemorySize
> when launching Solr, and then for every collection I want to use caching
> with, I need to be sure the block cache settings are enabled per the
> defaults, except that solr.hdfs.blockcache.write.enabled should be false.
>
> 2. If I use solr.hdfs.blockcache.global, is the slab count still per core,
> or does it apply to everything, or is it no longer relevant?
>
> 3. Is there a sneaky way of ensuring a given collection or core loads
> first so no other cores accidentally override the global blockcache setting?
>
> 4. In terms of -XX:MaxDirectMemorySize and
> solr.hdfs.blockcache.slab.count, is there some percentage of system ram or
> some overall maximum beyond which it no longer achieves benefits, or can I
> actually just tune this to be nearly all of the ram minus the JVM's
> overhead and the RAM needed by the system? Or can it even be set higher
> than the overall RAM just to be sure?
>
> Thanks,
>
> Michael Della Bitta
>


Re: Integrate UIMA and DIH

2014-08-25 Thread paulparsons
I forgot to mention in the previous post that I changed the analysis engine
from

/org/apache/uima/desc/OverridingParamsExtServicesAE.xml

to

/org/apache/uima/desc/AggregateSentenceAE

In doing so, I forgot the '.xml' extension, which is what was causing the
error. It would be helpful if the error messages were a little more
descriptive!
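
For anyone hitting the same thing, the relevant line now looks like this (a
sketch of just that setting from my config):

<str name="analysisEngine">/org/apache/uima/desc/AggregateSentenceAE.xml</str>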



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Integrate-UIMA-and-DIH-tp4154576p4155075.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: embedded documents

2014-08-25 Thread Jack Krupansky
And a comparison to Elasticsearch would be helpful, since ES gets a lot of 
mileage from their super-easy JSON support. IOW, how much of the ES 
"advantage" is eliminated.


-- Jack Krupansky

-Original Message- 
From: Noble Paul

Sent: Monday, August 25, 2014 1:59 PM
To: solr-user@lucene.apache.org
Subject: Re: embedded documents

The simplest use case is to dump the entire JSON using split=/&f=/**. I am
planning to add an alias for the same (SOLR-6343).

Nested-document support is missing now and we will need to add it. A ticket
needs to be opened.


--
Noble Paul



Re: embedded documents

2014-08-25 Thread Michael Pitsounis
Hi Jack,


I uploaded the code for a friend here:
http://www.solrfromscratch.com/2014/08/20/embedded-documents-in-solr/ [it
is not the latest code, I will update it in a couple of hours]

Multilevel nesting is supported.

In the case of arrays, e.g.

personalities_json:[
  {id:5},
  {id:3}
]

initially I flattened to

personalities.0.id:5
personalities.1.id:3

BUT that is not very useful in the end, because you cannot query it. So I
removed that indexing and store them in a multivalued field:

personalities.id:5
personalities.id:3






Regards,
M.





On Sun, Aug 24, 2014 at 2:43 PM, Jack Krupansky 
wrote:

> Indexing and query of raw JSON would be a valuable addition to Solr, so
> maybe you could simply explain more precisely your data model and
> transformation rules. For example, when multi-level nesting occurs, what
> does your loader do?
>
> Maybe if the field names were derived by concatenating the full path of
> JSON key names, like titles_json.FR, field-name nesting could be handled
> in a fully automated manner.
>
> I had been thinking of filing a Jira proposing exactly that, so that even
> the most deeply nested JSON maps could be supported, although combinations
> of arrays and maps would be problematic.
>
> -- Jack Krupansky


Re: embedded documents

2014-08-25 Thread Noble Paul
The simplest use case is to dump the entire JSON using split=/&f=/**. I am
planning to add an alias for the same (SOLR-6343).

Nested-document support is missing now and we will need to add it. A ticket
needs to be opened.


On Mon, Aug 25, 2014 at 6:45 AM, Jack Krupansky 
wrote:

> Thanks, Erik, but... I've read that Jira several times over the past
> month, it is far too cryptic for me to make any sense out of what it is
> really trying to do. A simpler approach is clearly needed.
>
> My perception of SOLR-6304 is not that it indexes a single JSON object as
> a single Solr document, but that it generates a collection of separate
> documents, somewhat analogous to Lucene block/child documents, but... not
> quite.
>
> I understood the request on this message thread to be the flattening of a
> single nested JSON object to a single Solr document.
>
> IMHO, we need to be trying to make Solr more automatic and more
> approachable, not an even more complicated "toolkit".
>
> -- Jack Krupansky


--
Noble Paul


Re: Incorrect group.ngroups value

2014-08-25 Thread alxsss
Hi,

From the discussion it is not clear whether this is a fixable bug in the case
of documents being in different shards. If it is fixable, could someone please
direct me to the relevant part of the code so that I can investigate.

Thanks.
Alex.

-Original Message-
From: Andrew Shumway 
To: solr-user 
Sent: Fri, Aug 22, 2014 8:15 am
Subject: RE: Incorrect group.ngroups value


The Co-location section of this document  
http://searchhub.org/2013/06/13/solr-cloud-document-routing/ 
might be of interest to you.  It mentions the need for using Solr Cloud routing 
to group documents in the same core so that grouping can work properly.

--Andrew Shumway


-Original Message-
From: Bryan Bende [mailto:bbe...@gmail.com] 
Sent: Friday, August 22, 2014 9:01 AM
To: solr-user@lucene.apache.org
Subject: Re: Incorrect group.ngroups value

Thanks Jim.

We've been using the composite id approach where we put group value as the 
leading portion of the id (i.e. groupValue!documentid), so I was expecting all 
of the documents for a given group to be in the same shard, but at least this 
gives me something to look into. I'm still suspicious of something changing 
between 4.6.1 and 4.8.1, because we've had the grouping implemented this way 
for 
a while, and only on the exact day we upgraded did someone bring this problem 
forward. I will keep investigating, thanks.
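
For reference, my understanding of how the composite id routing is supposed
to work (a sketch with made-up ids):

{"id": "groupA!doc1", "groupField": "groupA"}
{"id": "groupA!doc2", "groupField": "groupA"}

Everything before the "!" drives the hash, so both documents should land on
the same shard, and group.ngroups should then count groupA exactly once.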


On Fri, Aug 22, 2014 at 9:18 AM, jim ferenczi 
wrote:

> Hi Bryan,
> This is a known limitations of the grouping.
> https://wiki.apache.org/solr/FieldCollapsing#RequestParameters
>
> group.ngroups:
>
>
> WARNING: If this parameter is set to true on a sharded environment,
> all the documents that belong to the same group have to be located in
> the same shard, otherwise the count will be incorrect. If you are
> using SolrCloud, consider using "custom hashing"
>
> Cheers,
> Jim
>
>
>
> 2014-08-21 21:44 GMT+02:00 Bryan Bende :
>
> > Is there any known issue with using group.ngroups in a distributed 
> > Solr using version 4.8.1 ?
> >
> > I recently upgraded a cluster from 4.6.1 to 4.8.1, and I'm noticing
> several
> > queries where ngroups will be more than the actual groups returned 
> > in the response. For example, ngroups will say 5, but then there 
> > will be 3
> groups
> > in the response. It is not happening on all queries, only some.
> >
>

 


Re: solr cloud going down repeatedly

2014-08-25 Thread Shawn Heisey
On 8/25/2014 4:23 AM, Jakov Sosic wrote:
> we ended up using cron to restart Tomcats every 7 days, each solr node
> per day... that way we avoid GC pauses.
>
> Until we figure things out in our dev environment and test GC
> optimizations, we will keep it this way.

If it's only doing a long GC pause once a week, I think I'd prefer to go
ahead and let it do the long GC pause.  It would be less of an
interruption than restarting Solr.

Or is it getting into a mode after several days where it goes crazy and
has a lot of major GC storms?  If that's the case, is it happening even
with the GC tuning parameters I gave you before?  I run my Solr
instances for months without issues.  Right now, my production Solr
instances have been running for 25 days.
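
For the archives, CMS-based tuning of the sort discussed on that wiki page
generally looks something like this (a sketch only; heap sizes and thresholds
depend on the workload and must be tested):

-Xms4g -Xmx4g
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC
-XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly
-XX:+CMSParallelRemarkEnabled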

Thanks,
Shawn



Re: embedded documents

2014-08-25 Thread Erik Hatcher
SOLR-6304 flattens a single JSON object into a single Solr document.  See 
Noble’s blog http://searchhub.org/2014/08/12/indexing-custom-json-data/ which 
states:

split: This parameter is required if you wish to transform the input
JSON. This is the path at which the JSON must be split. If the entire JSON
makes a single Solr document, the path must be “/”.

The purpose of that issue was exactly this, to make Solr more “approachable” in 
that arbitrary (albeit structured, not random) JSON could be ingested into Solr 
without writing code.  Mission accomplished there :)
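
For example, sending an arbitrary nested JSON file through the new endpoint
looks roughly like this (a sketch assuming a collection named collection1):

curl 'http://localhost:8983/solr/collection1/update/json/docs?split=/&f=/**' \
  -H 'Content-type:application/json' -d @doc.json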

Your mention of block/join does pique my curiosity though.  There may need to 
be some additional tweaks to make this JSON loading be able to index things 
just right for that feature.

Erik
  @ Lucidworks



On Aug 25, 2014, at 6:45 AM, Jack Krupansky  wrote:

> Thanks, Erik, but... I've read that Jira several times over the past month, 
> it is far too cryptic for me to make any sense out of what it is really
> trying to do. A simpler approach is clearly needed.
> 
> My perception of SOLR-6304 is not that it indexes a single JSON object as a 
> single Solr document, but that it generates a collection of separate 
> documents, somewhat analogous to Lucene block/child documents, but... not 
> quite.
> 
> I understood the request on this message thread to be the flattening of a 
> single nested JSON object to a single Solr document.
> 
> IMHO, we need to be trying to make Solr more automatic and more approachable, 
> not an even more complicated "toolkit".
> 
> -- Jack Krupansky



Re: Integrate UIMA and DIH

2014-08-25 Thread paulparsons
I added default="true" to my updateRequestProcessorChain:
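
(The chain element itself was stripped from the archive; a chain of roughly
this shape is what I mean, with the uimaConfig details omitted:)

<updateRequestProcessorChain name="uima" default="true">
  <processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
    <lst name="uimaConfig">
      <!-- analysis engine settings omitted -->
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>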



Now I'm getting errors when running the DIH:


ERROR org.apache.solr.core.SolrCore  – org.apache.solr.common.SolrException:
org.apache.uima.resource.ResourceInitializationException
at
org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory.getInstance(UIMAUpdateRequestProcessorFactory.java:64)
at
org.apache.solr.update.processor.UpdateRequestProcessorChain.createProcessor(UpdateRequestProcessorChain.java:204)
at
org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:178)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1962)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at
org.eclipse.jetty.servlets.CrossOriginFilter.handle(CrossOriginFilter.java:247)
at
org.eclipse.jetty.servlets.CrossOriginFilter.doFilter(CrossOriginFilter.java:210)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at
org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
at
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
at
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.uima.resource.ResourceInitializationException
at
org.apache.lucene.analysis.uima.ae.BasicAEProvider.getAE(BasicAEProvider.java:58)
at
org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory.getInstance(UIMAUpdateRequestProcessorFactory.java:61)
... 35 more
Caused by: java.lang.NullPointerException
at org.apache.uima.util.XMLInputSource.<init>(XMLInputSource.java:118)
at
org.apache.lucene.analysis.uima.ae.BasicAEProvider.getInputSource(BasicAEProvider.java:84)
at
org.apache.lucene.analysis.uima.ae.BasicAEProvider.getAE(BasicAEProvider.java:50)
... 36 more



I've looked at the source code that is pointed to, but can't determine what
the problem is. I've also noticed from other posts that people in the past
have had a similar problem with ResourceInitializationException, but there
doesn't seem to be any general solution.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Integrate-UIMA-and-DIH-tp4154576p4155039.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Retrieving multivalued field elements

2014-08-25 Thread Ahmet Arslan
Hi Vivek,

how about this?

Iterator<SolrDocument> iter = queryResponse.getResults().iterator();

while (iter.hasNext()) {
  SolrDocument resultDoc = iter.next();

  Collection<Object> content = resultDoc.getFieldValues("discussions");
}
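
and then, inside the loop, something like this to print every element
(assuming the field values are printable as-is):

for (Object value : content) {
  System.out.println(value);
}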



On Monday, August 25, 2014 4:55 PM, Vivekanand Ittigi  
wrote:
Hi,

I've a multivalued field and I want to display all array elements using a
SolrJ command.

I used the command mentioned below, but I'm able to retrieve only the 1st
element of the array.

response.getResults().get(0).getFieldValueMap().get("discussions")
Output: Creation Time - 2014-06-12 17:37:53.0

NOTE: "discussions" is multivalued field in solr which contains


      Creation Time - 2014-06-12 17:37:53.0
      Last modified Time - 2014-06-12 17:42:09.0
      Comment - posting bug from risk flows ...posting comment from
risk flows ...syncing comments ...
    

Is there any SolrJ API for retrieving multivalued elements, or is it not
possible?

-Vivek



Splitting a backup index

2014-08-25 Thread Alexander Ramos Jardim
Hi,

I am making some performance tests with a backup index from one week ago.
For these tests I use a newly provisioned infrastructure identical to my
production environment.

As my production collections have 2 shards each, I begin the test putting
each backup shard in a different host so that I have scenario like:

collection1
host1 : backup_shard1
host2 : backup_shard2

Each shard contains 200k documents and is 4GB large.

Then, I issue a SPLITSHARD action at backup_shard1 with the command
http://host1:8983/solr/admin/collections?action=SPLITSHARD&shard=backup_shard1&colection=collection1

The shard is split in two as expected, but none of the documents from the
original shard go to the new ones, even after running a commit following the
SPLITSHARD action.

In another test, if I start both hosts with empty shards and begin to index
documents, I get the right behaviour for SPLITSHARD, i.e., I can see the
documents after the shards have been split.

So, what am I doing wrong at my backup recovery? How can I take shards from
backup and split them on a different cluster than the original?

PS: solr version is 4.9.0

-- 
Alexander Ramos Jardim


Retrieving multivalued field elements

2014-08-25 Thread Vivekanand Ittigi
Hi,

I've a multivalued field and I want to display all array elements using a
SolrJ command.

I used the command mentioned below, but I'm able to retrieve only the 1st
element of the array.

response.getResults().get(0).getFieldValueMap().get("discussions")
Output: Creation Time - 2014-06-12 17:37:53.0

NOTE: "discussions" is multivalued field in solr which contains

 
  Creation Time - 2014-06-12 17:37:53.0
  Last modified Time - 2014-06-12 17:42:09.0
  Comment - posting bug from risk flows ...posting comment from
risk flows ...syncing comments ...


Is there any SolrJ API for retrieving multivalued elements, or is it not
possible?

-Vivek


Re: embedded documents

2014-08-25 Thread Jack Krupansky
Thanks, Erik, but... I've read that Jira several times over the past month, 
it is far too cryptic for me to make any sense out of what it is really
trying to do. A simpler approach is clearly needed.


My perception of SOLR-6304 is not that it indexes a single JSON object as a 
single Solr document, but that it generates a collection of separate 
documents, somewhat analogous to Lucene block/child documents, but... not 
quite.


I understood the request on this message thread to be the flattening of a 
single nested JSON object to a single Solr document.


IMHO, we need to be trying to make Solr more automatic and more 
approachable, not an even more complicated "toolkit".


-- Jack Krupansky

-Original Message- 
From: Erik Hatcher

Sent: Monday, August 25, 2014 9:32 AM
To: solr-user@lucene.apache.org
Subject: Re: embedded documents

Jack et al - there’s now this, which is available in the any-minute release 
of Solr 4.10: https://issues.apache.org/jira/browse/SOLR-6304


Erik





Re: embedded documents

2014-08-25 Thread Erik Hatcher
Jack et al - there’s now this, which is available in the any-minute release of 
Solr 4.10: https://issues.apache.org/jira/browse/SOLR-6304

Erik

On Aug 25, 2014, at 5:01 AM, Jack Krupansky  wrote:

> That's a completely different concept, I think - the ability to return a 
> single field value as a structured JSON object in the "writer", rather than 
> simply "loading" from a nested JSON object and distributing the key values to 
> normal Solr fields.
> 
> -- Jack Krupansky



Re: Help with StopFilterFactory

2014-08-25 Thread Jack Krupansky
Interesting. First, an apology for an error in my e-book - it says that the 
enablePositionIncrements parameter for the stop filter defaults to "false", 
but it actually defaults to "true". The question mark represents a "position 
increment". In your case you don't want position increments, so add the 
enablePositionIncrements="false" parameter to the stop filter, and be sure 
to reindex your data. The position increment leaves a "hole" where each stop 
word was removed. The question mark represents the hole. All bets are off as 
to what phrase query does when the phrase starts with a hole. I think the 
basic idea is that there must be some term in the index at that position 
that can be "skipped".


This is actually a change in behavior, which occurred as a side effect of 
LUCENE-4963 in 4.4. The default for enablePositionIncrements was false, but 
that release changed it to true.


I suspect that I wrote that section of my e-book before 4.4 came out. 
Unfortunately, the change is not well documented - nothing in the Javadoc, 
and this is another example of where an underlying change in Lucene that 
impacts Solr users is not well highlighted for Solr users. Sorry about that.


In any case, try adding enablePositionIncrements="false", reindex, and see 
what happens.
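
Concretely, the index-time filter entry would look like this (a sketch; your
stopwords file name may differ):

<filter class="solr.StopFilterFactory" ignoreCase="true"
        words="stopwords.txt" enablePositionIncrements="false"/>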


-- Jack Krupansky

-Original Message- 
From: heaven

Sent: Monday, August 25, 2014 3:37 AM
To: solr-user@lucene.apache.org
Subject: Re: Help with StopFilterFactory

A valid search:
http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za
An Invalid search:
http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww

What I found weird is that the valid query has:
"parsedquery_toString": "+(url_words_ngram:\"twitter com zer0sleep\")"
And the invalid one has:
"parsedquery_toString": "+(url_words_ngram:\"? twitter com zer0sleep\")"

So "https" part was replaced with a "?".



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-tp4153839p4154957.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: Exact search with special characters

2014-08-25 Thread Jack Krupansky
To be honest, I'm not precisely sure what Google is really doing under the
hood, since there is no detailed spec publicly available. We know that
quotes do force a phrase search in Google, but do they disable stemming or
preserve case and special characters? Unknown. My PERCEPTION of Google is
that it does disable stemming but continues to be case-insensitive and to
ignore special characters in quoted phrases, but I don't see that behavior
documented in Google's search help. IOW, trying to fall back on a precise
definition from Google won't help us here; we don't have a clear view of
"exact search with special characters" for Google itself.


Bottom line: If you want to search both with and without special characters, 
that will have to be done with separate fields with separate analyzers.


You could use the combination of the keyword tokenizer and the ngram filter 
(at index time only) to support what YOU SEEM to be calling "exact match", 
but then you will need to specify that separate field name in addition to 
quoting the phrase. Or, just use a string field and then do wildcard or 
regex queries on that field for whatever degree of "exactness" you require.
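
For the keyword-tokenizer variant, a sketch of what that separate field might
look like (the field and type names here are made up):

<fieldType name="text_exact" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="name_exact" type="text_exact" indexed="true" stored="false"/>
<copyField source="name" dest="name_exact"/>

Then query name_exact:"test host" for the literal match.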


-- Jack Krupansky

-Original Message- 
From: Shay Sofer

Sent: Monday, August 25, 2014 8:02 AM
To: solr-user@lucene.apache.org
Subject: RE: Exact search with special characters

Hi,

Thanks for your reply.

I thought that Google search works the same way (quotes stand for exact match).

Example of my requirements:
Objects:
- test host
- test_host
- test $host
- test-host

When I search for test host I'll get all the above results.

When I search for "test host" I'll get only test host.

Also, when searching for a partial string like test / host I'll get all the
above results.


Thanks.




RE: Exact search with special characters

2014-08-25 Thread Shay Sofer
Hi,

Thanks for your reply.

I thought that Google search works the same way (quotes stand for exact match).

Example of my requirements:
Objects:
- test host
- test_host
- test $host
- test-host

When I search for test host I'll get all the above results.

When I search for "test host" I'll get only test host.

Also, when searching for a partial string like test / host I'll get all the
above results.

Thanks.

-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com] 
Sent: Sunday, August 24, 2014 3:34 PM
To: solr-user@lucene.apache.org
Subject: Re: Exact search with special characters 

What precisely do you mean by the term "exact search"? I mean, Solr (and
Lucene) do not have that concept for tokenized text fields.

Or did you simply mean "quoted phrase". In which case, you need to be aware 
that all the quotes do is assure that the terms occur in that order or in close 
proximity according to the default or specified "phrase slop" 
distance. But each term is still analyzed according to the analyzer for the 
field.

Technically, Lucene will in fact analyze the full quoted phrase as one stream, 
which for non-tokenized fields will be one term, but for any tokenized fields 
which split on white space, the phrase will be broken into separate tokens and 
special characters will tend to be removed as well. The keyword tokenizer will 
indeed treat the entire phrase as a single token, and the white space tokenizer 
will preserve special characters, but the standard tokenizer will not preserve 
either white space or special characters.

Nominally, the keyword tokenizer does generate a single term at least at the 
tokenization stage, but the word delimiter filter then splits individual terms
into multiple terms, thus guaranteeing that a phrase with white space will be 
multiple terms and special characters are removed as well.

The other technicality is that quoting a phrase does prevent the phrase from 
being interpreted as query parser syntax, such as AND and OR operators or 
treating special characters as query parser operators.

But, the fact remains that a quoted phrase is not treated as an "exact" 
string literal for any normal tokenized fields.

Out of curiosity, what references have led you to believe that a quoted phrase
is an "exact match"?

Use a "string" (not "tokenized text") field if you wish to make an "exact 
match" on a literal string, but the concept of "exact match" is not supported 
for tokenized and filtered text fields.

So, please describe, in plain English, plus examples, exactly what you expect 
your analyzer to do, both in terms of how it treats text to be indexed and how 
you expect to be able to query that text.

-- Jack Krupansky

-Original Message-
From: Shay Sofer
Sent: Sunday, August 24, 2014 5:58 AM
To: solr-user@lucene.apache.org
Subject: Exact search with special characters

Hi all,

I have docs that are indexed in a text field with the schema mentioned below.

I have those docs names:

-  Test host

-  Test_host

-  Test-host

-  Test $host

When I'm trying to do an exact search like: "test host"
All the results from above are shown as results.

How can I use exact match so I'll get only one result?

I prefer to make my changes at search time, but if I need to change my schema
please suggest that.

Thanks,
Shay.


This is my schema:

















Re: embedded documents

2014-08-25 Thread Jack Krupansky
That's a completely different concept, I think - the ability to return a 
single field value as a structured JSON object in the "writer", rather than 
simply "loading" from a nested JSON object and distributing the key values 
to normal Solr fields.


-- Jack Krupansky

-Original Message- 
From: Bill Bell

Sent: Sunday, August 24, 2014 7:30 PM
To: solr-user@lucene.apache.org
Subject: Re: embedded documents

See my Jira. It supports it via json.fsuffix=_json&wt=json

http://mail-archives.apache.org/mod_mbox/lucene-dev/201304.mbox/%3CJIRA.12641293.1365394604231.125944.1365397875874@arcas%3E

Bill Bell
Sent from mobile


On Aug 24, 2014, at 6:43 AM, "Jack Krupansky"  
wrote:


Indexing and query of raw JSON would be a valuable addition to Solr, so 
maybe you could simply explain more precisely your data model and 
transformation rules. For example, when multi-level nesting occurs, what 
does your loader do?


Maybe if the field names were derived by concatenating the full path of
JSON key names, like titles_json.FR, field-name nesting could be handled
in a fully automated manner.


I had been thinking of filing a Jira proposing exactly that, so that even 
the most deeply nested JSON maps could be supported, although combinations 
of arrays and maps would be problematic.


-- Jack Krupansky

-Original Message- From: Michael Pitsounis
Sent: Wednesday, August 20, 2014 7:14 PM
To: solr-user@lucene.apache.org
Subject: embedded documents

Hello everybody,

I had a requirement to store complicated json documents in solr.

I have modified the JsonLoader to accept complicated JSON documents with
arrays/objects as values.

It stores the object/array and then flattens it and indexes the fields.

e.g., a basic example document

{
  "titles_json":{"FR":"This is the FR title" , "EN":"This is the EN
title"} ,
  "id": 103,
  "guid": "3b2f2998-85ac-4a4e-8867-beb551c0b3c6"
 }

It will store titles_json:{"FR":"This is the FR title" , "EN":"This is the
EN title"}
and then index fields

titles.FR:"This is the FR title"
titles.EN:"This is the EN title"


Do you see any problems with this approach?



Regards,
Michael Pitsounis 




Re: solr cloud going down repeatedly

2014-08-25 Thread Jakov Sosic

On 08/19/2014 04:58 PM, Shawn Heisey wrote:
> On 8/19/2014 3:12 AM, Jakov Sosic wrote:
>> Thank you for your comment.
>>
>> How did you test these settings? I mean, that's a lot of tuning and I
>> would like to set up some test environment to be certain this is what
>> I want...
>
> I included a section on tools when I wrote this page:
>
> http://wiki.apache.org/solr/SolrPerformanceProblems#GC_pause_problems


Thanks,


We ended up using cron to restart the Tomcats every 7 days, one Solr node
per day... that way we avoid the GC pauses.


Until we figure things out in our dev environment and test GC 
optimizations, we will keep it this way.
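
The crontab entries themselves are nothing fancy, roughly (a sketch with
made-up service names):

# /etc/crontab - restart one Solr node per day at 03:00
0 3 * * 1 root service tomcat-solr1 restart
0 3 * * 2 root service tomcat-solr2 restart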




Copying a collection from one version of SOLR to another

2014-08-25 Thread phiroc

Hello,

is it possible to copy a collection created with SOLR 4.6.0 to a SOLR 4.9.0 
server?

I have just copied a collection called 'collection3', located in 
solr4.6.0/example/solr,  to solr4.9.0/example/solr, but to no avail, because my 
SOLR 4.9.0 Server's admin does not list it among the available cores.

What am I doing wrong?

Many thanks.

Philippe



Re: How do I get index size and datasize

2014-08-25 Thread Aurélien MAZOYER

Hi,

Have a look at the 'data' directory in your solr_home.
The .fdt and .fdx files are used to store the data of stored fields. You can
consider the size of the other files as the size Solr uses for its index.
You can have a look at
http://lucene.apache.org/core/4_9_0/core/org/apache/lucene/codecs/lucene49/package-summary.html#file-names
for more information.
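
For example (a sketch, with a made-up path):

# total index size for one core
du -sh /path/to/solr_home/collection1/data/index
# portion used by stored (non-indexed) field data
du -ch /path/to/solr_home/collection1/data/index/*.fdt \
      /path/to/solr_home/collection1/data/index/*.fdx | tail -1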


Regards,


On 25/08/2014 09:40, Ramprasad Padmanabhan wrote:

I have Solr working for my stats pages. When I run the index I need to know
how much of the space occupied by Solr is used for the index and how much is
used for storing non-indexed data.





How do I get index size and datasize

2014-08-25 Thread Ramprasad Padmanabhan
I have Solr working for my stats pages. When I run the index I need to know
how much of the space occupied by Solr is used for the index and how much is
used for storing non-indexed data.


Re: Help with StopFilterFactory

2014-08-25 Thread heaven
A valid search:
http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za
An Invalid search:
http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww

What I found weird is that the valid query has:
"parsedquery_toString": "+(url_words_ngram:\"twitter com zer0sleep\")"
And the invalid one has:
"parsedquery_toString": "+(url_words_ngram:\"? twitter com zer0sleep\")"

So "https" part was replaced with a "?".



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-tp4153839p4154957.html
Sent from the Solr - User mailing list archive at Nabble.com.