Re: User-defined properties and configsets

2016-01-28 Thread Georg Sorst
Any takers?

Georg Sorst wrote on Sun., 24 Jan. 2016 00:22:

> Hi list!
>
> I've just started playing with Solr 5 (upgrading from Solr 4) and want to
> use configsets. I'm currently struggling with how to use user-defined
> properties and configsets together.
>
> My solrconfig.xml contains a few properties. Previously these were in a
> solrcore.properties and thus were properly loaded and substituted by Solr.
>
> Now I've moved my configuration to a configset (as I may need to create
> several cores with the same config). When I create a core with
> http://localhost:8983/solr/admin/cores?action=CREATE&name=mycore&configSet=myconfigset
> Solr tells me:
>
> Caused by: org.apache.solr.common.SolrException: Error loading solr config
> from //configsets/myconfigset/conf/solrconfig.xml
> at
> org.apache.solr.core.SolrConfig.readFromResourceLoader(SolrConfig.java:186)
> at
> org.apache.solr.core.ConfigSetService.createSolrConfig(ConfigSetService.java:94)
> at
> org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:74)
> ... 30 more
> Caused by: org.apache.solr.common.SolrException: No system property or
> default value specified for  value:
> 
> at
> org.apache.solr.util.PropertiesUtil.substituteProperty(PropertiesUtil.java:66)
> ...
>
> Where should I put my properties so Solr can load them when I create a new
> core using this config set? From what I read I could specify them as system
> properties (-Dmyproperty=...) but I'd rather keep them in a file that I can
> check in.
>
> Thanks!
> Georg
>
>
>
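
One approach that avoids -D flags, sketched here under assumptions: the CoreAdmin CREATE call accepts property.<name>=<value> parameters, which Solr should record as core properties for the new core. The property name my.custom.prop and its value are hypothetical placeholders; the core and configset names are the ones used in the message above.

```python
from urllib.parse import urlencode


def core_create_url(base, core, configset, properties):
    """Build a CoreAdmin CREATE URL that also passes user-defined properties."""
    params = {"action": "CREATE", "name": core, "configSet": configset}
    # Each user-defined property is sent as property.<name>=<value>.
    for key, value in properties.items():
        params["property." + key] = value
    return base + "/admin/cores?" + urlencode(params)


# "my.custom.prop" is a hypothetical property referenced in solrconfig.xml.
url = core_create_url(
    "http://localhost:8983/solr",
    "mycore",
    "myconfigset",
    {"my.custom.prop": "some-value"},
)
print(url)
```

The script that issues this request can be checked in, which keeps the values under version control even though they are not stored inside the configset itself.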


Re: implement exact match for one of the search fields only?

2016-01-28 Thread Jan Høydahl
Hi

Please look at my github repo with a template for a field type allowing exact 
match. Typical use is with disMax query parser and the “pf” param.
See https://github.com/cominvent/exactmatch

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 28 Jan 2016, at 10:52, Derek Poh wrote:
> 
> Hi
> 
> First of all, sorry for the long post.
> 
> How do I implement or structure the query such that one of the search fields 
> is an exact phrase match while the rest of the search fields can be exact or 
> partial matches? Is this possible?
> 
> I have the following search fields
> - P_VeryShortDescription
> - P_ShortDescription
> - P_CatConcatKeyword
> - spp_keyword_exact
> 
> For the spp_keyword_exact field, I want to apply an exact match to it.
> 
> I have a document with the following information. If I search for 'dvd', this 
> document should not match. However if I search for 'dvd bracket', this 
> document should match.
> Right now when I search for 'dvd', it is not returned, which is correct.
> I want it to be returned when I search for 'dvd bracket', but it is not.
> I tried enclosing it in double quotes "dvd bracket", but it is not returned. Then
> again, I can't enclose the search terms in double quotes "dvd bracket", as
> documents with the words 'dvd' and 'bracket' in the other fields will
> then not be matched, am I right?
> 
> doc:
> 

Query cache with grouping

2016-01-28 Thread Robert Brown

Hi,

During some testing, I've found that the queryResultCache is not used 
when I use grouping.


Is there another cache that is being used in this scenario, if so, 
which, and how can I ensure they're providing a real benefit?


Thanks,
Rob



Re: Solr partial date range search

2016-01-28 Thread Alessandro Benedetti
I agree with Erick:
converting dates into strings is really a bad idea.

This "custom component" makes me curious. I assume you have:
Front End - Search API - Solr, or something similar.
If you want the front end to send the partial date, then the search API can
handle the conversion.
If you want the front end to do the conversion, do it at that point.
I can't see any problem with that, unless you are using Solr directly,
without any middle layer (which would surprise me).
Can you give us more explanation?
This "custom component" sounds wrong to me. How are you currently using Solr?

Cheers

On 28 January 2016 at 06:09, Jeyaprakash Singarayar 
wrote:

> Hi Sriram,
>
> Add the tag 'propertyWriter' directly under the 'dataConfig' tag. The
> property "last_index_time" is converted to text and stored in the
> properties file and is available for the next import as the variable
> '${dih.last_index_time}' . This tag gives control over how this properties
> file is written.
>
> It's available in all 4.X versions.
>
> Hope it helps.
>
> On Thu, Jan 28, 2016 at 1:41 AM, vsriram30  wrote:
>
> > I am actually using one such component to take in the partial dates like
> > 2015-10 and create full UTC dates out of it and query using that. But since
> > I was checking on that wiki about partial date search and since I couldn't
> > find that it is available only from 5.x, I was curious to know if by some
> > way I can make it work in 4.6.1 without need of my custom component.
> >
> > Thanks,
> > Sriram
> >
> >
> >
> > --
> > View this message in context:
> >
> http://lucene.472066.n3.nabble.com/Solr-partial-date-range-search-tp4253226p4253649.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
>



-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England


Re: Adding new documents to the search results and rescoring. Is it possible?

2016-01-28 Thread Jack Krupansky
Please provide a little more context.

How exactly are new documents getting added to a result set? I mean, each
query has its own result set, so there really isn't any way for a new query
to impact the results of a previous query.

Scores are always calculated fresh on each query, so there would never be a
need to "re" score them. Are you simply looking for a way to shift/boost
the scores somehow? Again, tell us more about what you are actually trying
to achieve.

-- Jack Krupansky

On Thu, Jan 28, 2016 at 9:52 AM, vitaly bulgakov 
wrote:

> I have Solr 4.2. Is it possible to rescore results after adding new
> documents
> to the result set?
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Adding-new-documents-to-the-search-results-and-rescoring-Is-it-possible-tp4253859.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Solr cannot return result when query with # * like title:#7654321*

2016-01-28 Thread Jack Krupansky
Thanks. This is what Yonik was referring to - that # is a special URL
syntax character which signifies that the text after the # is what is known
as a fragment identifier, which is separated from the path and query
parameters of the URL. The Solr query is simply one URL query parameter
(=value). You need to escape the #, as %23. But if you are using
SolrJ, the escaping should be handled by the SolrJ API itself.

See:
https://en.wikipedia.org/wiki/Fragment_identifier
https://tools.ietf.org/html/rfc3986
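
To make the fragment problem concrete, here is a small sketch using only Python's standard library. It splits the unescaped URL from the question to show where the # cuts it off, then builds an escaped version. The host, port and collection name are taken from the question; nothing here is Solr-specific.

```python
from urllib.parse import urlencode, urlsplit

raw = "http://127.0.0.1:8080/solr/collection1/select?q=title:#7654321*"

# Everything after '#' is treated as a fragment and is not part of the query.
parts = urlsplit(raw)
print(parts.query)     # the query string that survives
print(parts.fragment)  # the part that is split off as a fragment

# urlencode percent-escapes the value, so '#' becomes %23.
escaped = "http://127.0.0.1:8080/solr/collection1/select?" + urlencode(
    {"q": "title:#7654321*"}
)
print(escaped)
```

With the escaped form, the whole value of q reaches the server as one parameter.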

Just to be super clear, how exactly are you sending the query to Solr - if
using curl, please post the full curl command.


-- Jack Krupansky

On Thu, Jan 28, 2016 at 1:03 AM, diyun2008  wrote:

> The query is rather simple:
> http://127.0.0.1:8080/solr/collection1/select?q=title:#7654321*
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-cannot-return-result-when-query-with-like-title-7654321-tp4253541p4253760.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: implement exact match for one of the search fields only?

2016-01-28 Thread Erick Erickson
bq: if you are interested phrase query, you should use String field

If you do this, you will NOT be able to search within the string. I.e.
if the doc field is "my dog has fleas" you cannot match
"dog has" with a string-based field.

If you want to match the _entire_ string or you want prefix-only
matching, then string might work, i.e. if you _only_ want to be able
to match

"my dog has fleas"
"my dog*"
but not
"dog has fleas".

On to the root question though.

I really think you want to look at edismax. What you're trying to do
is apply the same search term to individual fields. In particular,
the pf parameter will automatically apply the search terms _as a phrase_
against the field specified, relieving you of having to enclose things
in quotes.

The manual way of doing this would be to construct an elaborate query, like
q=spp_keyword_exact:"dvd bracket" OR P_ShortDescription:(dvd bracket) OR

NOTE: the parens are necessary or the last part of the above would be
parsed as
P_ShortDescription:dvd default_searchfield:bracket

And the &debug=query trick will show you exactly how things are actually
searched, it's invaluable.
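
A rough sketch of the edismax request described above, as a set of parameters. The field names come from the original question; using spp_keyword_exact in pf and adding debug=query are assumptions about how one might wire this up, and whether the phrase boost behaves as hoped depends on how that field is analyzed.

```python
from urllib.parse import urlencode


def edismax_debug_query(user_query):
    """Return a query string that searches several fields and phrase-boosts one."""
    params = {
        "defType": "edismax",
        "q": user_query,
        # Fields searched term by term (from the original question).
        "qf": "P_VeryShortDescription P_ShortDescription P_CatConcatKeyword",
        # Field against which the whole input is applied as a phrase (assumption).
        "pf": "spp_keyword_exact",
        # Ask Solr to report how the query was parsed.
        "debug": "query",
    }
    return urlencode(params)


query_string = edismax_debug_query("dvd bracket")
print(query_string)
```

Note that pf only boosts documents that already match through qf; it does not make the phrase match mandatory.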

Best,
Erick

On Thu, Jan 28, 2016 at 5:08 AM, Mugeesh Husain  wrote:
> Hi,
> if you are interested in phrase queries, you should use a String field instead of
> a text field in the schema, like
>  
>
> this will solve your problem.
>
> If you are missing anything else, let us know.
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/implement-exact-match-for-one-of-the-search-fields-only-tp4253786p4253827.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: SolrCloud replicas out of sync

2016-01-28 Thread Tomás Fernández Löbbe
Maybe you are hitting the reordering issue described in SOLR-8129?

Tomás

On Wed, Jan 27, 2016 at 11:32 AM, David Smith 
wrote:

> Sure.  Here is our SolrCloud cluster:
>
>+ Three (3) instances of Zookeeper on three separate (physical)
> servers.  The ZK servers are beefy and fairly recently built, with 2x10
> GigE (bonded) Ethernet connectivity to the rest of the data center.  We
> recognize the importance of the stability and responsiveness of ZK to the
> stability of SolrCloud as a whole.
>
>+ 364 collections, all with single shards and a replication factor of
> 3.  Currently housing only 100,000,000 documents in aggregate.  Expected to
> grow to 25 billion+.  The size of a single document would be considered
> “large”, by the standards of what I’ve seen posted elsewhere on this
> mailing list.
>
> We are always open to ZK recommendations from you or anyone else,
> particularly for running a SolrCloud cluster of this size.
>
> Kind Regards,
>
> David
>
>
>
> On 1/27/16, 12:46 PM, "Jeff Wartes"  wrote:
>
> >
> >If you can identify the problem documents, you can just re-index those
> after forcing a sync. Might save a full rebuild and downtime.
> >
> >You might describe your cluster setup, including ZK. it sounds like
> you’ve done your research, but improper ZK node distribution could
> certainly invalidate some of Solr’s assumptions.
> >
> >
> >
> >
> >On 1/27/16, 7:59 AM, "David Smith"  wrote:
> >
> >>Jeff, again, very much appreciate your feedback.
> >>
> >>It is interesting — the article you linked to by Shalin is exactly why
> we picked SolrCloud over ES, because (eventual) consistency is critical for
> our application and we will sacrifice availability for it.  To be clear,
> after the outage, NONE of our three replicas are correct or complete.
> >>
> >>So we definitely don’t have CP yet — our very first network outage
> resulted in multiple overlapped lost updates.  As a result, I can’t pick
> one replica and make it the new “master”.  I must rebuild this collection
> from scratch, which I can do, but that requires downtime which is a problem
> in our app (24/7 High Availability with few maintenance windows).
> >>
> >>
> >>So, I definitely need to “fix” this somehow.  I wish I could outline a
> reproducible test case, but as the root cause is likely very tight timing
> issues and complicated interactions with Zookeeper, that is not really an
> option.  I’m happy to share the full logs of all 3 replicas though if that
> helps.
> >>
> >>I am curious though if the thoughts have changed since
> https://issues.apache.org/jira/browse/SOLR-5468 of seriously considering
> a “majority quorum” model, with rollback?  Done properly, this should be
> free of all lost update problems, at the cost of availability.  Some
> SolrCloud users (like us!!!) would gladly accept that tradeoff.
> >>
> >>Regards
> >>
> >>David
> >>
> >>
>
>


Re: implement exact match for one of the search fields only?

2016-01-28 Thread Jan Høydahl
It depends on what exactly you are trying to do. I think the GitHub README explains in
which situations my solution excels.
In particular, if you do not control the client application and simply
receive a q=foo, such a setup lets you boost exact matches very easily.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 28 Jan 2016, at 16:28, Alessandro Benedetti wrote:
> 
> Jan,
> I admit I only took a brief look, but what is the benefit of using your
> strategy instead of an additional non-tokenised (keywordTokenized) copy
> field?
> 
> Cheers
> 
> On 28 January 2016 at 15:22, Jan Høydahl  wrote:
> 
>> Hi
>> 
>> Please look at my github repo with a template for a field type allowing
>> exact match. Typical use is with disMax query parser and the “pf” param.
>> See https://github.com/cominvent/exactmatch
>> 
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> 
>>> On 28 Jan 2016, at 10:52, Derek Poh wrote:
>>> 
>>> Hi
>>> 
>>> First of all, sorry for the long post.
>>> 
>>> How do I implement or structure the query such that one of the search
>>> fields is an exact phrase match while the rest of the search fields can be
>>> exact or partial matches? Is this possible?
>>> 
>>> I have the following search fields
>>> - P_VeryShortDescription
>>> - P_ShortDescription
>>> - P_CatConcatKeyword
>>> - spp_keyword_exact
>>> 
>>> For the spp_keyword_exact field, I want to apply an exact match to it.
>>> 
>>> I have a document with the following information. If I search for 'dvd',
>>> this document should not match. However if I search for 'dvd bracket', this
>>> document should match.
>>> Right now when I search for 'dvd', it is not returned, which is correct.
>>> I want it to be returned when I search for 'dvd bracket', but it is not.
>>> I tried enclosing it in double quotes "dvd bracket", but it is not returned.
>>> Then again, I can't enclose the search terms in double quotes "dvd bracket",
>>> as documents with the words 'dvd' and 'bracket' in the other fields
>>> will then not be matched, am I right?
>>> 
>>> doc:
>>> 

Mysql data import issue

2016-01-28 Thread vsriram30
Hi,
I am using Solr 4.6.1 and I am trying to import my data from mysql to solr.

In mysql, I have a table with columns,
id, legacyid, otherfields...

In solr I have columns : id, other fields. I want to map the legacyid field
in my mysql table with Solr's id column and skip the "id" field of mysql
while doing import. Hence I have a mapping,


But still I get one to one mapping of my mysql id field to solr's id field.
Can you please let me know how to prevent this from happening?

I even mapped id field of mysql to empty solr field. 

But still I get mysql id field to solr id field mapping. Please let me know
how to prevent this from happening. 

Thanks,
Sriram



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Mysql-data-import-issue-tp4253998.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: implement exact match for one of the search fields only?

2016-01-28 Thread Jack Krupansky
A simple boost query (bq) might do the trick, using edismax:

q=dvd bracket
bq=spp_keyword_exact:"dvd bracket"^100
qf=P_VeryShortDescription P_ShortDescription P_CatConcatKeyword
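
As a sketch, the three parameters above can be assembled into a request URL like this. The base URL and core name (mycore) are hypothetical, and defType=edismax is added explicitly on the assumption that the handler does not already default to it.

```python
from urllib.parse import urlencode

# Hypothetical Solr location; adjust to your own host and core.
BASE = "http://localhost:8983/solr/mycore/select"

params = [
    ("defType", "edismax"),
    ("q", "dvd bracket"),
    # Boost documents whose exact-match field equals the whole phrase.
    ("bq", 'spp_keyword_exact:"dvd bracket"^100'),
    ("qf", "P_VeryShortDescription P_ShortDescription P_CatConcatKeyword"),
]

# urlencode takes care of the quotes, spaces, colon and caret in bq.
url = BASE + "?" + urlencode(params)
print(url)
```

Because bq only adds to the score, documents that match on the other fields are still returned; exact matches on spp_keyword_exact are simply ranked higher.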

-- Jack Krupansky

On Thu, Jan 28, 2016 at 12:49 PM, Erick Erickson 
wrote:

> bq: if you are interested phrase query, you should use String field
>
> If you do this, you will NOT be able to search within the string. I.e.
> if the doc field is "my dog has fleas" you cannot match
> "dog has" with a string-based field.
>
> If you want to match the _entire_ string or you want prefix-only
> matching, then string might work, i.e. if you _only_ want to be able
> to match
>
> "my dog has fleas"
> "my dog*"
> but not
> "dog has fleas".
>
> On to the root question though.
>
> I really think you want to look at edismax. What you're trying to do
> is apply the same search term to individual fields. In particular,
> the pf parameter will automatically apply the search terms _as a phrase_
> against the field specified, relieving you of having to enclose things
> in quotes.
>
> The manual way of doing this would be to construct an elaborate query, like
> q=spp_keyword_exact:"dvd bracket" OR P_ShortDescription:(dvd bracket)
> OR
>
> NOTE: the parens are necessary or the last part of the above would be
> parsed as
> P_ShortDescription:dvd default_searchfield:bracket
>
> And the &debug=query trick will show you exactly how things are actually
> searched, it's invaluable.
>
> Best,
> Erick
>
> On Thu, Jan 28, 2016 at 5:08 AM, Mugeesh Husain  wrote:
> > Hi,
> > if you are interested in phrase queries, you should use a String field instead
> > of a text field in the schema, like
> >  
> >
> > this will solve your problem.
> >
> > If you are missing anything else, let us know.
> >
> >
> >
> > --
> > View this message in context:
> http://lucene.472066.n3.nabble.com/implement-exact-match-for-one-of-the-search-fields-only-tp4253786p4253827.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Solr partial date range search

2016-01-28 Thread vsriram30
Hi Benedetti Alessandro,

Thanks for your comments. In our application, Solr search is used in
multiple places. With respect to using a middle layer, our online requests
go through the search API (Middle layer) which is built on top of solr,
whereas the editorial tool, along with few other custom tools directly
contact Solr. 

Hence instead of implementing the partial date search in multiple frontends,
I have implemented a SearchComponent which would parse the incoming query,
identify date fields in it, take the partial date value, construct a
full UTC date out of it and send the query on to search the underlying index.
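
For illustration only, here is one way such a conversion could look. This is not the poster's SearchComponent; it is a standalone sketch that turns a partial date (YYYY, YYYY-MM or YYYY-MM-DD) into a half-open range of full UTC timestamps in Solr's range syntax. The function name is hypothetical.

```python
from datetime import datetime, timedelta


def partial_date_to_range(partial):
    """Expand a partial date such as '2015-10' into a Solr date range string."""
    parts = [int(p) for p in partial.split("-")]
    if len(parts) == 1:
        start = datetime(parts[0], 1, 1)
        end = datetime(parts[0] + 1, 1, 1)
    elif len(parts) == 2:
        year, month = parts
        start = datetime(year, month, 1)
        # First day of the following month, rolling over in December.
        end = datetime(year + month // 12, month % 12 + 1, 1)
    elif len(parts) == 3:
        start = datetime(*parts)
        end = start + timedelta(days=1)
    else:
        raise ValueError("expected YYYY, YYYY-MM or YYYY-MM-DD: " + partial)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    # '[' includes the start, '}' excludes the end.
    return "[{} TO {}}}".format(start.strftime(fmt), end.strftime(fmt))


print(partial_date_to_range("2015-10"))
```

The exclusive upper bound avoids having to compute the last millisecond of the period.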

Thanks,
Sriram



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-partial-date-range-search-tp4253226p4253973.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: collection aliasing

2016-01-28 Thread Jeff Wartes
I enjoy using collection aliases in all client references, because that allows 
me to change the collection all clients use without updating the clients. I 
just move the alias. 
This is particularly useful if I’m doing a full index rebuild and want an 
atomic, zero-downtime switchover.
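
A minimal sketch of that switchover, assuming the Collections API CREATEALIAS action (re-issuing it for an existing alias name moves the alias). The alias name products and the collection name products_v2 are hypothetical.

```python
from urllib.parse import urlencode


def create_alias_url(base, alias, collections):
    """Build a Collections API URL that points an alias at the given collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    }
    return base + "/admin/collections?" + urlencode(params)


# After rebuilding the index into products_v2, repoint the alias clients use.
alias_url = create_alias_url("http://localhost:8983/solr", "products", ["products_v2"])
print(alias_url)
```

Clients keep querying the alias name and never need to learn the name of the rebuilt collection.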





On 1/28/16, 6:07 AM, "Shawn Heisey"  wrote:

>On 1/28/2016 2:59 AM, vidya wrote:
>> Hi
>>
>> Then what is the difference between collection aliasing and shards parameter
>> mentioned in request handler of solrconfig.xml.
>>
>> In request handler of new collection's solrconfig.xml
>>shards =
>> http://localhost:8983/solr/collection1,http://localhost:8983/solr/collection1
>> I can query both data of collection1 and collection2 in new collection which
>> is same as collection aliasing.
>>
>> Is my understanding correct ? If so, then what is the special characteristic
>> of collection aliasing. Please help me.
>
>Collection aliasing handles it completely automatically, no need to put
>a shards parameter *anywhere*.  That is the main difference.
>
>The shards parameter is the old way of doing distributed searches. 
>SolrCloud completely automates the process so that neither the admin nor
>the user has to worry about it.  Aliases are part of that automation.
>
>Thanks,
>Shawn
>


How to convert string field to date

2016-01-28 Thread Kallu, Sreenivasa (HQP)
Hi,
   I am new to solr.

I am using managed-schema. I am not using schema.xml.  I am indexing outlook 
email messages.
I can see only three fields ( id,_version_,_text_) defined in 
managed-schema. Remaining fields are
handled by following dynamic field


I have a field named attr_date with type string. I want to convert this field type to 
date. Currently date range queries are not
working on this field. I tried schema API to add new field attr_date and got 
following error message
"Field 'attr_date' already exists".  I tried to replace field type to date and 
got following error message
"The field 'attr_date' is not present in this schema, and so cannot be 
replaced".

Please help me to convert "attr_date"  field type to date.

Thanks in advance.
--sreenivasa kallu




Re: Solr partial date range search

2016-01-28 Thread vsriram30
Hi Jeyaprakash,

Thanks for your suggestions. Are you referring to Dataimporthandler
properties, to configure this in data-config.xml? Since I am currently
referring to partial date search in my online queries, I am not sure whether
this will help achieve that. Can you please explain a bit more?

Thanks,
Sriram



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-partial-date-range-search-tp4253226p4253974.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to convert string field to date

2016-01-28 Thread Steve Rowe
Hi Sreenivasa,

This is a known bug: https://issues.apache.org/jira/browse/SOLR-8607

(though the problem is not just about catch-all fields as the issue currently 
indicates - all dynamic fields are affected)

Two workarounds (neither tested):

1. Add attr_date via add-dynamic-field instead of add-field (even though the 
name has no asterisk)
2. Remove the attr_* dynamic field, add attr_date, then add attr_* back; these 
can be done with a single request.
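
A sketch of what the single request for workaround 2 might carry, assuming the Schema API accepts several commands in one JSON body and applies them in the order given. The field type names ("date", "string") and the attr_* attributes are assumptions, since the original dynamic field definition is not fully shown in this thread; like the workaround itself, this is untested.

```python
import json


def build_schema_commands(field_name, field_type, dynamic_field):
    """Return a JSON body: drop the dynamic field, add the explicit field, restore it."""
    commands = {
        "delete-dynamic-field": {"name": dynamic_field["name"]},
        "add-field": {"name": field_name, "type": field_type, "stored": True},
        "add-dynamic-field": dynamic_field,
    }
    return json.dumps(commands, indent=2)


# Assumed definition of the catch-all dynamic field (type is a guess).
attr_dynamic = {"name": "attr_*", "type": "string", "stored": True, "multiValued": True}

payload = build_schema_commands("attr_date", "date", attr_dynamic)
print(payload)
```

The body would be POSTed with Content-Type application/json to the core's /schema endpoint. Documents already indexed with attr_date as a string would most likely need to be reindexed afterwards.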

I’ll update SOLR-8607 to reflect these things.

--
Steve
www.lucidworks.com

> On Jan 28, 2016, at 3:58 PM, Kallu, Sreenivasa (HQP) 
>  wrote:
> 
> Hi,
>   I am new to solr.
> 
> I am using managed-schema. I am not using schema.xml.  I am indexing outlook 
> email messages.
> I can see only three fields ( id,_version_,_text_) defined in 
> managed-schema. Remaining fields are
> handled by following dynamic field
>  multiValued="true"/>
> 
> I have a field named attr_date with type string. I want to convert this field type 
> to date. Currently date range queries are not
> working on this field. I tried schema API to add new field attr_date and got 
> following error message
> "Field 'attr_date' already exists".  I tried to replace field type to date 
> and got following error message
> "The field 'attr_date' is not present in this schema, and so cannot be 
> replaced".
> 
> Please help me to convert "attr_date"  field type to date.
> 
> Thanks in advance.
> --sreenivasa kallu
> 
> 



Re: implement exact match for one of the search fields only?

2016-01-28 Thread Derek Poh

Hi Emir

For the other search fields, if they have matches, the document should be returned.

On 1/28/2016 8:17 PM, Emir Arnautovic wrote:

Hi Derek,
It is not clear what you are trying to achieve: "one of the search 
fields is an exact phrase match while the rest of the search fields 
can be exact or partial matches". What does "while" mean: does it have to 
match in the other fields as well, or should a result be scored higher if it 
does, without the match being mandatory?

For exact match you can use string type instead of text.
For querying multiple fields you can take a look at (e)dismax query 
parser.


Regards,
Emir




--
CONFIDENTIALITY NOTICE 

This e-mail (including any attachments) may contain confidential and/or privileged information. If you are not the intended recipient or have received this e-mail in error, please inform the sender immediately and delete this e-mail (including any attachments) from your computer, and you must not use, disclose to anyone else or copy this e-mail (including any attachments), whether in whole or in part. 


This e-mail and any reply to it may be monitored for security, legal, 
regulatory compliance and/or other appropriate reasons.

Re: implement exact match for one of the search fields only?

2016-01-28 Thread Derek Poh
Do you mean for the spp_keyword_exact field, I should use String field 
with keyword tokenised and lowercase token filtered?


On 1/28/2016 10:54 PM, Alessandro Benedetti wrote:

I think you are overthinking the problem :
I agree that the described one is the most obvious solution in your case.
The only addition is to use a keyword-tokenised field type, with a lowercase token
filter if you want to be case-insensitive.

Cheers

On 28 January 2016 at 13:08, Mugeesh Husain  wrote:


Hi,
if you are interested in phrase queries, you should use a String field instead of
a text field in the schema, like
  

this will solve your problem.

If you are missing anything else, let us know.



--
View this message in context:
http://lucene.472066.n3.nabble.com/implement-exact-match-for-one-of-the-search-fields-only-tp4253786p4253827.html
Sent from the Solr - User mailing list archive at Nabble.com.








RE: How to convert string field to date

2016-01-28 Thread Kallu, Sreenivasa (HQP)
Thanks, Steve, for the prompt response.

I tried workaround one. 
i.e.  1. Add attr_date via add-dynamic-field instead of add-field (even though 
the name has no asterisk)

I am able to add the dynamic field attr_date. But while starting Solr, I am 
getting the following message:
Could not load conf for core sreenimsg: Dynamic field name 'attr_date' should 
have either a leading or a trailing asterisk, and no others.

So Solr is looking for either a leading * or a trailing * in the dynamic field name.

I can see similar problems in workaround 2.

Any other suggestions?

Thanks in advance.
--sreenivasa kallu

-----Original Message-----
From: Steve Rowe [mailto:sar...@gmail.com] 
Sent: Thursday, January 28, 2016 1:17 PM
To: solr-user@lucene.apache.org
Subject: Re: How to convert string field to date

Hi Sreenivasa,

This is a known bug: https://issues.apache.org/jira/browse/SOLR-8607

(though the problem is not just about catch-all fields as the issue currently 
indicates - all dynamic fields are affected)

Two workarounds (neither tested):

1. Add attr_date via add-dynamic-field instead of add-field (even though the 
name has no asterisk)
2. Remove the attr_* dynamic field, add attr_date, then add attr_* back; these 
can be done with a single request.

I’ll update SOLR-8607 to reflect these things.

--
Steve
www.lucidworks.com

> On Jan 28, 2016, at 3:58 PM, Kallu, Sreenivasa (HQP) 
>  wrote:
> 
> Hi,
>   I am new to solr.
> 
> I am using managed-schema. I am not using schema.xml.  I am indexing outlook 
> email messages.
> I can see only three fields ( id,_version_,_text_) defined in 
> managed-schema. Remaining fields are handled by following dynamic 
> field  stored="true" multiValued="true"/>
> 
> I have a field named attr_date with type string. I want to convert this 
> field type to date. Currently date range queries are not working on this field. 
> I tried schema API to add new field attr_date and got following error 
> message "Field 'attr_date' already exists".  I tried to replace field type to 
> date and got following error message "The field 'attr_date' is not present in 
> this schema, and so cannot be replaced".
> 
> Please help me to convert "attr_date"  field type to date.
> 
> Thanks in advance.
> --sreenivasa kallu
> 
> 



Re: implement exact match for one of the search fields only?

2016-01-28 Thread Emir Arnautovic

Hi Derek,
It is not clear what you are trying to achieve: "one of the search 
fields is an exact phrase match while the rest of the search fields can 
be exact or partial matches". What does "while" mean: does it have to match 
in the other fields as well, or should a result be scored higher if it does, 
without the match being mandatory?

For exact match you can use string type instead of text.
For querying multiple fields you can take a look at (e)dismax query parser.

Regards,
Emir

--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



On 28.01.2016 10:52, Derek Poh wrote:

Hi

First of all, sorry for the long post.

How do I implement or structure the query such that one of the search 
fields is an exact phrase match while the rest of the search fields 
can be exact or partial matches? Is this possible?


I have the following search fields
- P_VeryShortDescription
- P_ShortDescription
- P_CatConcatKeyword
- spp_keyword_exact

For the spp_keyword_exact field, I want to apply an exact match to it.

I have a document with the following information. If I search for 
'dvd', this document should not match. However if I search for 'dvd 
bracket', this document should match.

Right now when I search for 'dvd', it is not returned, which is correct.
I want it to be returned when I search for 'dvd bracket', but it is not.
I tried enclosing it in double quotes "dvd bracket", but it is not 
returned. Then again, I can't enclose the search terms in double quotes 
"dvd bracket", as documents with the words 'dvd' and 'bracket' in 
the other fields will then not be matched, am I right?


doc:

Re: implement exact match for one of the search fields only?

2016-01-28 Thread Mugeesh Husain
Hi,
if you are interested in phrase queries, you should use a String field instead of
a text field in the schema, like
 

this will solve your problem.

If you are missing anything else, let us know.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/implement-exact-match-for-one-of-the-search-fields-only-tp4253786p4253827.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Unable to query the spellchecker in a distributed way

2016-01-28 Thread Damien Picard
(we use Solr 4.4)

2016-01-28 11:07 GMT+01:00 Damien Picard :

> Hi,
>
> We are using SolrCloud (4 nodes) and we have defined a suggester using the
> spellcheck component.
>
> The suggester is defined as :
>
> 
>   
> suggestOpeGes
>  name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup
>  name="classname">org.apache.solr.spelling.suggest.Suggester
> ref_opegestion
> 0
> true
> true
>   
>   
> suggestRefCre
>  name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup
>  name="classname">org.apache.solr.spelling.suggest.Suggester
> ref_cre
> 0
> true
> true
>   
>   
> suggestRefEcr
>  name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup
>  name="classname">org.apache.solr.spelling.suggest.Suggester
> ref_ecriture
> 0
> true
> true
>   
>   
>startup="lazy">
> 
> true
> suggestOpeGes
> 20
> true
> false
> 
> 
>   suggest
> 
>   
>
> When I query this collection suggest with the shards parameters :
> GET
> /solr/ppd_piste_audit_gsie_traite_001/suggest?q=GSIEBBA=json=true=true=suggestOpeGes=suggest/
>
> I get no results :
>
> {
>   "responseHeader":{
> "status":0,
> "QTime":0}}
>
> But, when I disable the distributed search :
> GET
> /solr/ppd_piste_audit_gsie_traite_001/suggest?q=GSIEMMA=json=true=true=suggestOpeGes=false
>
> I get the results I expect :
>
> {
>   "responseHeader":{
> "status":0,
> "QTime":28},
>   "spellcheck":{
> "suggestions":[
>   "GSIEBBA",{
> "numFound":20,
> "startOffset":0,
> "endOffset":7,
> "suggestion":["GSIEMMA44257700010010401",
>   "GSIEBBA64257700010013501",
>   "GSIEBBA70723503779040201",
>   "GSIEBBA71257700030012101",
>   "GSIEBBA71723503830023601",
>   "GSIEBBA74001300670011701",
>   "GSIEBBA74001300670011801",
>   "GSIEBBA74772000136021201",
>   "GSIEBBA76257700040010501",
>   "GSIEBBA76600101133030501",
>   "GSIEBBA76680400195030601",
>   "GSIEBBA77692100093024401",
>   "GSIEBBA77692100093024501",
>   "GSIEBBA78450700227020701",
>   "GSIEBBA78450700227020801",
>   "GSIEBBA78854102439020301",
>   "GSIEBBA78854102439020401",
>   "GSIEBBA79441700201040401",
>   "GSIEBBA79723504720012701",
>   "GSIEBBA79763600779010501"]},
>   "collation","GSIEBBA44257700010010401"]}}
>
> I also tried to send a "manually" distributed search, without success:
>
> GET
> /solr/ppd_piste_audit_gsie_traite_001-03_shard1_replica2/suggest?q=GSIEMMA=suggest=json=true=true=suggestOpeGes=suggest/=dn330003.xxx.priv:8983/solr/ppd_piste_audit_gsie_traite_001-03_shard2_replica1/|dn330004.xxx.priv:8983/solr/ppd_piste_audit_gsie_traite_001-03_shard1_replica1/
>
> What am I doing wrong ?
>
> Thank you.
> --
> Damien Picard
> Expert GWT
> 
> Mob : 06 11 51 47 78
>



-- 
Damien Picard
Expert GWT

Mob : 06 11 51 47 78


implement exact match for one of the search fields only?

2016-01-28 Thread Derek Poh

Hi

First of all, sorry for the long post.

How do I implement or structure the query so that one of the search 
fields requires an exact phrase match while the rest of the search fields can 
be exact or partial matches? Is this possible?


I have the following search fields
- P_VeryShortDescription
- P_ShortDescription
- P_CatConcatKeyword
- spp_keyword_exact

For the spp_keyword_exact field, I want to apply an exact match to it.

I have a document with the following information. If I search for 'dvd', 
this document should not match. However if I search for 'dvd bracket', 
this document should match.

Right now, when I search for 'dvd', it is not returned, which is correct.
I want it to be returned when I search for 'dvd bracket', but it is not.
I tried enclosing it in double quotes ("dvd bracket"), but it is still not 
returned. Then again, I can't enclose the search terms in double quotes, 
as documents with the words 'dvd' and 'bracket' in the 
other fields would then not be matched. Am I right?


doc:

enum vs string performance

2016-01-28 Thread Prateek Jain J

Hi,

We have some fixed string constants in our application, such as eventType and 
sourceEvent. We have no requirement for partial/wildcard search on these 
fields. Will there be any performance gain when inserting or querying if we 
define their fieldType as EnumField in Solr?


Regards,
Prateek Jain
Team: Totoro
Mobile: +353 894 391716



Re: collection aliasing

2016-01-28 Thread vidya
Hi

Then what is the difference between collection aliasing and the shards parameter
specified in a request handler in solrconfig.xml?

In a request handler of the new collection's solrconfig.xml:
   shards =
http://localhost:8983/solr/collection1,http://localhost:8983/solr/collection1
With this I can query the data of both collection1 and collection2 through the new
collection, which is the same as collection aliasing.

Is my understanding correct? If so, what is the special characteristic
of collection aliasing? Please help me.

Thanks in advance




--
View this message in context: 
http://lucene.472066.n3.nabble.com/collection-aliasing-tp4252527p4253787.html
Sent from the Solr - User mailing list archive at Nabble.com.


Apache solr can be made near-real-Time???

2016-01-28 Thread Samina
I want to use Solr for enterprise-level search over a large data set (in the
TB range), where lakhs of records will be updated per hour and approximately
3 lakh records will be searched per hour. These are only rough values.
How can we achieve near-real-time search in Solr, and to what extent is
real-time search possible on data this large?
Can we achieve this by indexing at a certain
interval (automatic or manual)?
Please help and suggest.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Apache-solr-can-be-made-near-real-Time-tp4253808.html
Sent from the Solr - User mailing list archive at Nabble.com.


Unable to query the spellchecker in a distributed way

2016-01-28 Thread Damien Picard
Hi,

We are using SolrCloud (4 nodes) and we have defined a suggester using the
spellcheck component.

The suggester is defined as :


  
suggestOpeGes
org.apache.solr.spelling.suggest.tst.TSTLookup
org.apache.solr.spelling.suggest.Suggester
ref_opegestion
0
true
true
  
  
suggestRefCre
org.apache.solr.spelling.suggest.tst.TSTLookup
org.apache.solr.spelling.suggest.Suggester
ref_cre
0
true
true
  
  
suggestRefEcr
org.apache.solr.spelling.suggest.tst.TSTLookup
org.apache.solr.spelling.suggest.Suggester
ref_ecriture
0
true
true
  
  
  

true
suggestOpeGes
20
true
false


  suggest

  

When I query this collection suggest with the shards parameters :
GET
/solr/ppd_piste_audit_gsie_traite_001/suggest?q=GSIEBBA=json=true=true=suggestOpeGes=suggest/

I get no results :

{
  "responseHeader":{
"status":0,
"QTime":0}}

But, when I disable the distributed search :
GET
/solr/ppd_piste_audit_gsie_traite_001/suggest?q=GSIEMMA=json=true=true=suggestOpeGes=false

I get the results I expect :

{
  "responseHeader":{
"status":0,
"QTime":28},
  "spellcheck":{
"suggestions":[
  "GSIEBBA",{
"numFound":20,
"startOffset":0,
"endOffset":7,
"suggestion":["GSIEMMA44257700010010401",
  "GSIEBBA64257700010013501",
  "GSIEBBA70723503779040201",
  "GSIEBBA71257700030012101",
  "GSIEBBA71723503830023601",
  "GSIEBBA74001300670011701",
  "GSIEBBA74001300670011801",
  "GSIEBBA74772000136021201",
  "GSIEBBA76257700040010501",
  "GSIEBBA76600101133030501",
  "GSIEBBA76680400195030601",
  "GSIEBBA77692100093024401",
  "GSIEBBA77692100093024501",
  "GSIEBBA78450700227020701",
  "GSIEBBA78450700227020801",
  "GSIEBBA78854102439020301",
  "GSIEBBA78854102439020401",
  "GSIEBBA79441700201040401",
  "GSIEBBA79723504720012701",
  "GSIEBBA79763600779010501"]},
  "collation","GSIEBBA44257700010010401"]}}

I also tried to send a "manually" distributed search, without success:

GET
/solr/ppd_piste_audit_gsie_traite_001-03_shard1_replica2/suggest?q=GSIEMMA=suggest=json=true=true=suggestOpeGes=suggest/=dn330003.xxx.priv:8983/solr/ppd_piste_audit_gsie_traite_001-03_shard2_replica1/|dn330004.xxx.priv:8983/solr/ppd_piste_audit_gsie_traite_001-03_shard1_replica1/

What am I doing wrong ?

Thank you.
-- 
Damien Picard
Expert GWT

Mob : 06 11 51 47 78


Re: Apache solr can be made near-real-Time???

2016-01-28 Thread Emir Arnautovic

Hi Samina,
First, thank you for teaching me what "lakh" means :)

Solr is capable of handling large amounts of data, but that requires a 
large Solr cluster. What you need to determine is what "real time" means 
for you, i.e. the maximum delay you can tolerate before an update becomes 
visible, and what query latency is acceptable. After that, you need to test 
with different shard sizes to achieve the target latency. Then you can 
extrapolate to your full data set and see how many shards you need.
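
The extrapolation step can be sketched as simple arithmetic. This is only an illustrative sketch, not a sizing tool: the function name and the sample numbers are hypothetical, and it assumes that the largest shard size which met the latency target in testing can simply be scaled up to the full data set, which would need to be confirmed by measurement.

```python
import math


def estimate_shard_count(total_docs: int, docs_per_shard: int) -> int:
    """Return how many shards are needed if each shard holds at most
    docs_per_shard documents (the largest size that met the latency
    target during testing)."""
    if total_docs < 0 or docs_per_shard <= 0:
        raise ValueError("total_docs must be >= 0 and docs_per_shard > 0")
    # Round up: a partially filled shard is still a whole shard.
    return max(1, math.ceil(total_docs / docs_per_shard))


# Hypothetical numbers: testing showed 50 million docs per shard meets the
# latency target, and the full data set is 1.2 billion docs.
shards_needed = estimate_shard_count(1_200_000_000, 50_000_000)
print(shards_needed)
```

Replication for query throughput and fault tolerance comes on top of this shard count.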

What you can do with your data to reduce hardware requirements:
* remove anything from the index that is not needed
* if you have time-related data, you can use time slicing
* for a multi-tenant index, you can use routing

Regards,
Emir

On 28.01.2016 12:20, Samina wrote:

I want to use Solr for enterprise-level search over a large data set (in the
TB range), where lakhs of records will be updated per hour and approximately
3 lakh records will be searched per hour. These are only rough values.
How can we achieve near-real-time search in Solr, and to what extent is
real-time search possible on data this large?
Can we achieve this by indexing at a certain
interval (automatic or manual)?
Please help and suggest.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Apache-solr-can-be-made-near-real-Time-tp4253808.html
Sent from the Solr - User mailing list archive at Nabble.com.


--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



upgrade SolrCloud

2016-01-28 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I'm planning to upgrade (from 5.4.0 to 5.4.1) a SolrCloud with two replicas 
(one shard).

Am I correct in thinking I should be able simply to shut down one node, change 
it to use 5.4.1, restart the upgraded node, then shut down the other node and 
upgrade it? Or are there caveats to consider?


Re: Mysql data import issue

2016-01-28 Thread Gora Mohanty
On 29 January 2016 at 04:13, vsriram30  wrote:

> Hi,
> I am using Solr 4.6.1 and I am trying to import my data from mysql to solr.
>
> In mysql, I have a table with columns,
> id, legacyid, otherfields...
>
[...]

> But still I get mysql id field to solr id field mapping. Please let me know
> how to prevent this from happening.
>

How about if you do not select the mysql "id" field in the query attribute
for the entity?

Regards,
Gora


Re: enum vs string performance

2016-01-28 Thread Alessandro Benedetti
Looking at the documentation, the EnumField type seems to be mainly about
the possibility of configuring a custom default sort order instead of the
lexicographic one.
When you say "any performance gain", what do you mean?
Performance gain for which tasks?
Space or time performance?

At index time you will need to store the same binary information, and I am pretty
sure the compression algorithm used will be the same as for normal strings.
At query time, you have built an inverted index, so the behaviour will be
pretty much the same.
What kind of benefits do you want that the string field cannot
provide?
What are the most common operations you carry out?

Cheers


On 28 January 2016 at 10:00, Prateek Jain J 
wrote:

>
> Hi,
>
> We have some fixed string constants in our application, such as eventType
> and sourceEvent. We have no requirement for partial/wildcard search on
> these fields. Will there be any performance gain when inserting or
> querying if we define their fieldType as EnumField in Solr?
>
>
> Regards,
> Prateek Jain
> Team: Totoro
> Mobile: +353 894 391716
>
>


-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England


Adding new documents to the search results and rescoring. Is it possible?

2016-01-28 Thread vitaly bulgakov
I have Solr 4.2. Is it possible to rescore results after adding new documents
to the result set?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Adding-new-documents-to-the-search-results-and-rescoring-Is-it-possible-tp4253859.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: implement exact match for one of the search fields only?

2016-01-28 Thread Alessandro Benedetti
I think you are overthinking the problem:
I agree the described one is the most obvious solution in your case.
The only addition is to use a keyword-tokenized field type, with a lowercase token
filter if you want to be case-insensitive.

Cheers

On 28 January 2016 at 13:08, Mugeesh Husain  wrote:

> Hi,
> If you are interested in a phrase query, you should use a string field instead
> of a text field in the schema, like this:
>  
>
> This will solve your problem.
>
> If you are missing anything else, let us know.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/implement-exact-match-for-one-of-the-search-fields-only-tp4253786p4253827.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>





Re: How to convert string field to date

2016-01-28 Thread Steve Rowe
Try workaround 2, I did and it worked for me.  See my comment on the issue: 


--
Steve
www.lucidworks.com

> On Jan 28, 2016, at 6:45 PM, Kallu, Sreenivasa (HQP) 
>  wrote:
> 
> Thanks, Steve, for the prompt response.
> 
> I tried workaround one,
> i.e. 1. Add attr_date via add-dynamic-field instead of add-field (even 
> though the name has no asterisk).
> 
> I am able to add the dynamic field attr_date, but when starting Solr I get 
> the following message:
> Could not load conf for core sreenimsg: Dynamic field name 'attr_date' should 
> have either a leading or a trailing asterisk, and no others.
> 
> So Solr is looking for either a leading * or a trailing * in the dynamic field name.
> 
> I can see similar problems with workaround 2.
> 
> Any other suggestions?
> 
> Thanks in advance.
> --sreenivasa kallu
> 
> -Original Message-
> From: Steve Rowe [mailto:sar...@gmail.com] 
> Sent: Thursday, January 28, 2016 1:17 PM
> To: solr-user@lucene.apache.org
> Subject: Re: How to convert string field to date
> 
> Hi Sreenivasa,
> 
> This is a known bug: https://issues.apache.org/jira/browse/SOLR-8607
> 
> (though the problem is not just about catch-all fields as the issue currently 
> indicates - all dynamic fields are affected)
> 
> Two workarounds (neither tested):
> 
> 1. Add attr_date via add-dynamic-field instead of add-field (even though the 
> name has no asterisk).
> 2. Remove the attr_* dynamic field, add attr_date, then add attr_* back; 
> these can be done with a single request.
> 
> I’ll update SOLR-8607 to reflect these things.
> 
> --
> Steve
> www.lucidworks.com
> 
>> On Jan 28, 2016, at 3:58 PM, Kallu, Sreenivasa (HQP) 
>>  wrote:
>> 
>> Hi,
>>  I am new to solr.
>> 
>> I am using managed-schema, not schema.xml. I am indexing Outlook 
>> email messages.
>> I can see only three fields (id, _version_, _text_) defined in 
>> managed-schema. The remaining fields are handled by the following dynamic 
>> field > stored="true" multiValued="true"/>
>> 
>> I have a field named attr_date with type string. I want to convert this 
>> field's type to date, because date range queries currently do not work on it. 
>> I tried the Schema API to add a new field attr_date and got the following error 
>> message: "Field 'attr_date' already exists". I tried to replace the field type 
>> with date and got the following error message: "The field 'attr_date' is not 
>> present in this schema, and so cannot be replaced".
>> 
>> Please help me to convert the "attr_date" field type to date.
>> 
>> Thanks in advance.
>> --sreenivasa kallu
>> 
>> 
> 



Re: implement exact match for one of the search fields only?

2016-01-28 Thread Derek Poh

Hi Erick and all

Yes, I am trying to apply the same search term to all 4 search 
fields, and 1 of the search fields must be an exact match.


You mentioned "In particular, the pf parameter will automatically apply 
the search terms _as a phrase_ against the field specified, relieving 
you of having to enclose things in quotes."

I tried, but it is not returning the document.

http://hkenedcdg1.globalsources.com:8983/solr/product/select?q=dvd%20bracket=spp_keyword_exact=edismax=query=spp_keyword_exact=P_ProductId,spp_keyword_exact,P_SPPKW

I may have misunderstood.


On 1/29/2016 1:49 AM, Erick Erickson wrote:

bq: if you are interested phrase query, you should use String field

If you do this, you will NOT be able to search within the string. I.e.
if the doc field is "my dog has fleas" you cannot match
"dog has" with a string-based field.

If you want to match the _entire_ string or you want prefix-only
matching, then string might work, i.e. if you _only_ want to be able
to match

"my dog has fleas"
"my dog*"
but not
"dog has fleas".

On to the root question though.

I really think you want to look at edismax. What you're trying to do
is apply the same search term to individual fields. In particular,
the pf parameter will automatically apply the search terms _as a phrase_
against the field specified, relieving you of having to enclose things
in quotes.

The manual way of doing this would be to construct an elaborate query, like
q=spp_keyword_exact:"dvd bracket" OR P_ShortDescription:(dvd bracket) OR

NOTE: the parens are necessary or the last part of the above would be
parsed as
P_ShortDescription:dvd default_searchfield:bracket

And the =query trick will show you exactly how things are actually
searched, it's invaluable.
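
The "manual way" described above can be sketched as a small query builder. This is a hedged illustration, not a tested Solr client: the helper name is hypothetical, the field names are the ones listed earlier in this thread, no escaping of Lucene special characters is attempted, and it assumes the debugging parameter referred to above is Solr's `debug=query`.

```python
from urllib.parse import urlencode


def build_exact_plus_partial_query(terms, exact_field, partial_fields):
    """Build a Lucene-syntax query: a quoted phrase against exact_field,
    OR-ed with a parenthesized term group against each partial field."""
    phrase = " ".join(terms)
    clauses = [f'{exact_field}:"{phrase}"']
    # Parentheses keep every term scoped to the named field; without them
    # only the first term would be, and the rest would hit the default field.
    clauses += [f"{field}:({phrase})" for field in partial_fields]
    return " OR ".join(clauses)


q = build_exact_plus_partial_query(
    ["dvd", "bracket"],
    "spp_keyword_exact",
    ["P_VeryShortDescription", "P_ShortDescription", "P_CatConcatKeyword"],
)
# debug=query asks Solr to include the parsed query in the response.
params = urlencode({"q": q, "debug": "query", "wt": "json"})
print(q)
print(params)
```

Whether the quoted clause behaves as an exact match still depends on how spp_keyword_exact is analyzed in the schema, so the parsed query in the debug output is worth checking.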

Best,
Erick

On Thu, Jan 28, 2016 at 5:08 AM, Mugeesh Husain  wrote:

Hi,
If you are interested in a phrase query, you should use a string field instead
of a text field in the schema, like this:
  

This will solve your problem.

If you are missing anything else, let us know.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/implement-exact-match-for-one-of-the-search-fields-only-tp4253786p4253827.html
Sent from the Solr - User mailing list archive at Nabble.com.






Re: Is Solr cache cleared when I restart Solr?

2016-01-28 Thread Alessandro Benedetti
As already mentioned, you need to distinguish between the Solr caches and the OS
memory-mapped files.
What you should clearly notice in your situation is an increase in the space
available for the OS memory-mapped files,
which means faster access to index segments (almost all the different data
structures are memory mapped).

Regarding the internal Solr caches, if you haven't changed the Java memory
properties, I would not expect much difference.
If you were on a virtual machine sharing physical memory with other virtual
instances, on the other hand, you could see benefits.

Cheers


On 28 January 2016 at 05:44, Zheng Lin Edwin Yeo 
wrote:

> Thanks Erick and Shawn for your reply.
>
> We have recently upgraded the server RAM from 64MB to 192MB, and I noticed
> that this caching occurs after we upgraded the RAM. Previously, the cache
> may not even be preserved in the same Solr session.
> So is it true that the upgrading of the server RAM creates enough spare
> memory for good caching?
>
> Regards,
> Edwin
>
>
> On 28 January 2016 at 12:27, Shawn Heisey  wrote:
>
> > On 1/27/2016 8:11 PM, Zheng Lin Edwin Yeo wrote:
> > > I would like to find out: is the Solr cache cleared when I shut down
> > > the Solr instance and restart it?
> > >
> > > I suspect that the cache is not entirely cleared, because when I run
> > > the same query as I did before the restart, it still has a
> > > QTime that is much faster than the initial search. However, when I
> > > run a new query, the QTime is back at the original speed.
> > >
> > > I am using Solr 5.4.0, and this is my setting for the queryResultCache.
> > >
> > >  > >  size="1024"
> > >  initialSize="512"
> > >  autowarmCount="0"/>
> >
> > The Solr caches are maintained in the Java heap, which is lost when Java
> > stops.
> >
> > Although the Solr caches are not preserved across a restart, the
> > operating system does cache actual index data in main memory, so when
> > Solr asks for the same index data off of the disk again, it is pulled
> > directly from RAM, which is a LOT faster than the disk.  In order to
> > deliver good performance, Solr is extremely reliant on this built-in
> > feature of all modern operating systems, so there must be enough spare
> > memory for good caching.
> >
> > Here are a couple of pages with some more detail:
> >
> > https://wiki.apache.org/solr/SolrPerformanceProblems
> > https://en.wikipedia.org/wiki/Page_cache
> >
> > Thanks,
> > Shawn
> >
> >
>





Re: collection aliasing

2016-01-28 Thread Shawn Heisey
On 1/28/2016 2:59 AM, vidya wrote:
> Hi
>
> Then what is the difference between collection aliasing and the shards parameter
> specified in a request handler in solrconfig.xml?
>
> In a request handler of the new collection's solrconfig.xml:
>    shards =
> http://localhost:8983/solr/collection1,http://localhost:8983/solr/collection1
> With this I can query the data of both collection1 and collection2 through the new
> collection, which is the same as collection aliasing.
>
> Is my understanding correct? If so, what is the special characteristic
> of collection aliasing? Please help me.

Collection aliasing handles it completely automatically, no need to put
a shards parameter *anywhere*.  That is the main difference.

The shards parameter is the old way of doing distributed searches. 
SolrCloud completely automates the process so that neither the admin nor
the user has to worry about it.  Aliases are part of that automation.
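
As a rough sketch of what setting up such an alias might look like, the snippet below only builds a Collections API CREATEALIAS URL and does not send it. The helper name, host, alias name, and collection names are hypothetical.

```python
from urllib.parse import urlencode


def create_alias_url(base_url, alias, collections):
    """Return a Collections API URL that creates (or updates) an alias
    pointing at the given collections."""
    params = urlencode({
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    })
    return f"{base_url}/admin/collections?{params}"


# Hypothetical host and names; queries sent to the alias "both" would then
# be served from collection1 and collection2 without any shards parameter.
print(create_alias_url("http://localhost:8983/solr", "both",
                       ["collection1", "collection2"]))
```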

Thanks,
Shawn



Re: Solrcloud error on finding active nodes.

2016-01-28 Thread Shawn Heisey
On 1/28/2016 12:15 AM, Pranaya Behera wrote:
> I have checked in the admin UI and now I have created 3 shards 2
> replicas for each shard and 1 shard per node. This is what I get:
>
> {"card":{ "replicationFactor":"2", "router":{"name":"compositeId"},
> "maxShardsPerNode":"1", "autoAddReplicas":"false", "shards":{
> "shard1":{ "range":"8000-d554", "state":"active",
> "replicas":{}}, "shard2":{ "range":"d555-2aa9",
> "state":"active", "replicas":{}}, "shard3":{
> "range":"2aaa-7fff", "state":"active", "replicas":{} There
> is no replica. How is this possible? This is what I used to create the
> collection: curl
> "http://localhost:8983/solr/admin/collections?action=CREATE=card=3=2=1=localhost:8983_solr,localhost:8984_solr,localhost:8985_solr=true=igp;
>

With numShards=3 and replicationFactor=2, Solr will need to create six
cores to satisfy the CREATE.  With the further restrictions of
maxShardsPerNode=1 and a list of three nodes to use, this request is
impossible to fulfill.
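
The arithmetic behind that statement can be written out as a quick check. This is a sketch of the reasoning only, not Solr's actual validation code; the function name is hypothetical.

```python
def can_place_collection(num_shards, replication_factor,
                         max_shards_per_node, node_count):
    """Return True when the listed nodes offer enough slots for every
    core that the CREATE request needs."""
    cores_needed = num_shards * replication_factor
    slots_available = max_shards_per_node * node_count
    return cores_needed <= slots_available


# 3 shards x 2 replicas = 6 cores, but 1 per node x 3 nodes = 3 slots.
print(can_place_collection(3, 2, 1, 3))
# With maxShardsPerNode=2 there are 6 slots, which is enough.
print(can_place_collection(3, 2, 2, 3))
```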

If this request appears to succeed but does not create a usable
clusterstate, then there may be a bug in the detection of impossible
CREATE requests.  The request *should* have failed entirely, because it
is not possible.

I noticed that the parameters in the first message on this thread
indicated maxShardsPerNode=2, so *that* request should have worked
correctly.

For each test that you are trying, you need to include what you tried,
what you expected, and what actually happened.  The info about what
actually happened needs to be non-ambiguous.  Screenshots and actual log
entries are a good way to make sure the information is concrete and
accurate.  Don't send these things as attachments -- put them on
appropriate sharing sites (gist, dropbox, etc.) and include URLs.

https://wiki.apache.org/solr/UsingMailingLists

Thanks,
Shawn



Re: implement exact match for one of the search fields only?

2016-01-28 Thread Alessandro Benedetti
Jan,
I admit I only took a brief look, but what is the benefit of using your
strategy instead of an additional non-tokenized (keyword-tokenized) copy
field?

Cheers

On 28 January 2016 at 15:22, Jan Høydahl  wrote:

> Hi
>
> Please look at my github repo with a template for a field type allowing
> exact match. Typical use is with disMax query parser and the “pf” param.
> See https://github.com/cominvent/exactmatch
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> > 28. jan. 2016 kl. 10.52 skrev Derek Poh :
> >
> > Hi
> >
> > First of all, sorry for the long post.
> >
> > How do I implement or structure the query so that one of the search
> fields requires an exact phrase match while the rest of the search fields can be
> exact or partial matches? Is this possible?
> >
> > I have the following search fields
> > - P_VeryShortDescription
> > - P_ShortDescription
> > - P_CatConcatKeyword
> > - spp_keyword_exact
> >
> > For the spp_keyword_exact field, I want to apply an exact match to it.
> >
> > I have a document with the following information. If I search for 'dvd',
> this document should not match. However if I search for 'dvd bracket', this
> document should match.
> > Right now, when I search for 'dvd', it is not returned, which is correct.
> > I want it to be returned when I search for 'dvd bracket', but it is not.
> > I tried enclosing it in double quotes ("dvd bracket"), but it is still not returned.
> Then again, I can't enclose the search terms in double quotes,
> as documents with the words 'dvd' and 'bracket' in the other fields
> would then not be matched. Am I right?
> >
> > doc:
> > 

Re: Solr cannot return result when query with # * like title:#7654321*

2016-01-28 Thread Ahmet Arslan
Hi Diyun,

Have you read 


https://lucidworks.com/blog/2011/11/29/whats-with-lowercasing-wildcard-multiterm-queries-in-solr/
 
and
https://wiki.apache.org/solr/MultitermQueryAnalysis
?

Ahmet


On Thursday, January 28, 2016 9:02 AM, diyun2008  wrote:
Hi Shawn

Your information is very important. It explains the phenomenon I encountered. 
Do you know where I can find related documentation on
what you described?
I want to understand this in depth.
  
Thank you very much!

Diyun





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-cannot-return-result-when-query-with-like-title-7654321-tp4253541p4253769.html
Sent from the Solr - User mailing list archive at Nabble.com.


How much JVM should we allocate

2016-01-28 Thread Midas A
Hi ,

CPU : 4
physical memory : 48 GB


We only have Solr on this server. How much memory can be allocated to the
JVM to run the server smoothly?

Regards,
Abhishek Tiwari


Solr+HDFS

2016-01-28 Thread Joseph Obernberger
Hi All - we're using Apache Solr Cloud 5.2.1, with an HDFS system that is
86% full.  Some of the datanodes in the HDFS cluster are closer to
being full than others.  We're getting messages about "Error adding
log" from the index process, which I **think** is related to datanodes
being full.
Is that the case, even though HDFS still has room available?

Upon restart of Solr Cloud, we see messages such as:

ERROR - 2016-01-28 22:15:27.594; [   UNCLASS] org.apache.solr.common.SolrException; Failure to open existing log file (non fatal) hdfs://nameservice1:8020/solr5.2/UNCLASS/core_node14/data/tlog/tlog.0282501:org.apache.solr.common.SolrException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): failed to create file /solr5.2/UNCLASS/core_node14/data/tlog/tlog.0282501 for DFSClient_NONMAPREDUCE_1371064795_29 for client 172.16.100.218 because current leaseholder is trying to recreate file.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3075)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2905)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:3186)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:3149)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:611)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.append(AuthorizationProviderProxyClientProtocol.java:124)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:416)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

    at org.apache.solr.update.HdfsTransactionLog.<init>(HdfsTransactionLog.java:131)
    at org.apache.solr.update.HdfsUpdateLog.init(HdfsUpdateLog.java:193)
    at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:136)
    at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:94)
    at org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:99)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:573)
    at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:635)
    at org.apache.solr.core.SolrCore.initUpdateHandler(SolrCore.java:928)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:786)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:658)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:637)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:381)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:375)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:148)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): failed to create file /solr5.2/UNCLASS/core_node14/data/tlog/tlog.0282501 for DFSClient_NONMAPREDUCE_1371064795_29 for client 172.16.100.218 because current leaseholder is trying to recreate file.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3075)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2905)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:3186)

at