Re: Distributing nodes with the collections API RESTORE command

2016-09-16 Thread Hrishikesh Gadre
Hi Stephen,

Thanks for the update. I filed SOLR-9527 for tracking purposes. I will
take a look and get back to you.

Thanks
Hrishikesh

On Fri, Sep 16, 2016 at 2:56 PM, Stephen Lewis  wrote:

> Hello,
>
> I've tried this on both solr 6.1 and 6.2, with the same result. You are
> right that the collections API offering collection-level backup/restore
> from a remote server is a new feature.
>
> After some more experimentation, I am fairly sure that this is a bug
> specific to how leaders are placed during backup restore. After I ran a
> command to restore a backup of the collection "foo" (which has
> maxShardsPerNode set to 1 as well) with a replication factor of 2, I see
> consistently that the followers (replica > 1) are correctly distributed,
> but all of the leaders are brought up hosted on one node.
>
> *Repro*
>
> *create *
> http://solr.test:8983/solr/admin/collections?action=
> CREATE=foo=3=1
> configName=test-one
> (after creation, all shards are on different nodes as expected)
>
> *backup*
> http://solr.test:8983/solr/admin/collections?action=
> BACKUP=foo-2=foo=foo-2
>
> *delete*
> http://solr.test:8983/solr/admin/collections?action=DELETE=foo
>
> *restore*
> Result: All leaders are hosted on one node; followers are spread about.
>
>  {
>   "responseHeader" : { "status" : 0,"QTime" : 7},
>   "cluster" : {
> "collections" : {
>   "foo" : {
> "replicationFactor" : "2",
> "shards" : {
>   "shard2" : {
> "range" : "d555-2aa9",
> "state" : "active",
> "replicas" : {
>   "core_node1" : {
> "core" : "foo_shard2_replica0",
> "base_url" : "http://IP1:8983/solr;,
> "node_name" : "IP1:8983_solr",
> "state" : "active",
> "leader" : "true"
>   },
>   "core_node4" : {
> "core" : "foo_shard2_replica1",
> "base_url" : "http://IP2:8983/solr;,
> "node_name" : "IP2:8983_solr",
> "state" : "recovering"
>   }
> }
>   },
>   "shard3" : {
> "range" : "2aaa-7fff",
> "state" : "active",
> "replicas" : {
>   "core_node2" : {
> "core" : "foo_shard3_replica0",
> "base_url" : "http://IP1:8983/solr;,
> "node_name" : "IP1:8983_solr",
> "state" : "active",
> "leader" : "true"
>   },
>   "core_node5" : {
> "core" : "foo_shard3_replica1",
> "base_url" : "http://IP3:8983/solr;,
> "node_name" : "IP3:8983_solr",
> "state" : "recovering"
>   }
> }
>   },
>   "shard1" : {
> "range" : "8000-d554",
> "state" : "active",
> "replicas" : {
>   "core_node3" : {
> "core" : "foo_shard1_replica0",
> "base_url" : "http://IP1:8983/solr;,
> "node_name" : "IP1:8983_solr",
> "state" : "active",
> "leader" : "true"
>   },
>   "core_node6" : {
> "core" : "foo_shard1_replica1",
> "base_url" : "http://IP4:8983/solr;,
> "node_name" : "IP4:8983_solr",
> "state" : "recovering"
>   }
> }
>   }
> },
> "router" : {
>   "name" : "compositeId"
> },
> "maxShardsPerNode" : "1",
> "autoAddReplicas" : "false",
> "znodeVersion" : 204,
> "configName" : "test-one"
>   }
> },
> "properties" : {
>   "location" : "/mnt/solr_backups"
> },
> "live_nodes" : [
>   "IP5:8983_solr",
>   "IP3:8983_solr",
>   "IP6:8983_solr",
>   "IP4:8983_solr",
>   "IP7:8983_solr",
>   "IP1:8983_solr",
>   "IP8:8983_solr",
>   "IP9:8983_solr",
>   "IP2:8983_solr"]
>   }
> }
>
>
> On Fri, Sep 16, 2016 at 2:07 PM, Reth RM  wrote:
>
> > Which version of solr? Afaik, until 6.1, the solr backup and restore
> > command apis required a separate backup for each shard, and then a
> > restore along the same lines (both go shard by shard). Version 6.1 seems
> > to have a new feature for backing up an entire collection and then
> > restoring it back into a new collection setup (did not try yet).
> >
> >
> > On Thu, Sep 15, 2016 at 1:45 PM, Stephen Lewis 
> wrote:
> >
> > > Hello,
> > >
> > > I have a solr cloud cluster in a test environment running 6.1 where I
> am
> > > looking at using the collections API BACKUP and RESTORE commands to
> > manage
> > > data integrity.
> > >
> > > When restoring from a backup, I'm finding the same behavior occurs
> every
> > > time; after the restore command, 

Re: Tutorial not working for me

2016-09-16 Thread Chris Hostetter

: I apologize if this is a really stupid question. I followed all

It's not a stupid question, the tutorial is completely broken -- and for 
that matter, in my opinion, the data_driven_schema_configs used by that 
tutorial (and recommended for new users) are largely useless for the same 
underlying reason...

https://issues.apache.org/jira/browse/SOLR-9526

Thank you very much for asking about this - hopefully the folks who 
understand this more (and don't share my opinion that the entire concept 
of data_driven schemas is a terrible idea) can chime in and explain WTF 
is going on here.


-Hoss
http://www.lucidworks.com/


Re: Solr Cloud Using Docker

2016-09-16 Thread Vincenzo D'Amore
Hi,

I did this: https://github.com/freedev/solrcloud-zookeeper-docker
If you want to give it a try, this project aims to help developers and newbies
who want to try Solr Cloud and Zookeeper in a Docker environment.
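To try it, cloning the repository above is all you need to start (the actual
startup steps are described in the repo's README):

    git clone https://github.com/freedev/solrcloud-zookeeper-docker.git
    cd solrcloud-zookeeper-docker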




On Fri, Sep 16, 2016 at 11:11 PM, John Bickerstaff  wrote:

> In case this is helpful - sponsored by Lucidworks
>
> https://hub.docker.com/_/solr/
>
> I can't speak to the pros and cons of using it in production except to say
> that they are probably the same as the pros and cons of running Docker in
> production (as in nothing is perfect)
>
> HTH...
>
> On Fri, Sep 16, 2016 at 2:37 PM, Brendan Grainger 
> wrote:
>
> > Hi,
> >
> > Does anyone use docker for deploying solr? I am using it for running a
> > single solr server ‘cloud’ locally on my dev box, but wondering about the
> > pros/cons of using it in production.
> >
> > Thanks,
> > Brendan
>



-- 
Vincenzo D'Amore
email: v.dam...@gmail.com
skype: free.dev
mobile: +39 349 8513251


Re: help with field definition

2016-09-16 Thread Gandham, Satya
Great, that worked. Thanks Ray and Emir for the solutions.



On 9/16/16, 3:49 PM, "Ray Niu"  wrote:

Just add q.op=OR to change default operator to OR and it should work

2016-09-16 12:44 GMT-07:00 Gandham, Satya :

> Hi Emir,
>
>Thanks for your reply. But I’m afraid I’m not seeing the
> expected response. I’ve included the query and the corresponding debug
> portion of the response:
>
> select?q=Justin\ Beiber=exactName_noAlias_en_US
>
>  Debug:
>
> "rawquerystring":"Justin\\ Beiber",
> "querystring":"Justin\\ Beiber",
> "parsedquery":"+((exactName_noAlias_en_US:justin
> exactName_noAlias_en_US:justin beiber)/no_coord) +exactName_noAlias_en_US:
> beiber",
> "parsedquery_toString":"+(exactName_noAlias_en_US:justin
> exactName_noAlias_en_US:justin beiber) +exactName_noAlias_en_US:beiber",
> "explain":{},
>
>
> Satya.
>
> On 9/16/16, 2:46 AM, "Emir Arnautovic" 
> wrote:
>
> Hi,
>
> I missed that you already did define field and you are having troubles
> with query (did not read stackoverflow). Added answer there, but just
> in
> case somebody else is having similar troubles, issue is how query is
> written - space has to be escaped:
>
>q=Justin\ Bieber
>
> Regards,
> Emir
>
> On 13.09.2016 23:27, Gandham, Satya wrote:
> > HI,
> >
> >I need help with defining a field ‘singerName’ with the
> right tokenizers and filters such that it gives me the below described
> behavior:
> >
> > I have a few documents as given below:
> >
> > Doc 1
> >singerName: Justin Beiber
> > Doc 2:
> >singerName: Justin Timberlake
> > …
> >
> >
> > Below is the list of queries and the corresponding matches:
> >
> > Query 1: “My fav artist Justin Beiber is very impressive”
> > Docs Matched : Doc1
> >
> > Query 2: “I have a Justin Timberlake poster on my wall”
> > Docs Matched: Doc2
> >
> > Query 3: “The name Bieber Justin is unique”
> > Docs Matched: None
> >
> > Query 4: “Timberlake is a lake of timber..?”
> > Docs Matched: None.
> >
> > I have this described a bit more detailed here:
> http://stackoverflow.com/questions/39399321/solr-shingle-query-matching-
> keyword-tokenized-field
> >
> > I’d appreciate any help in addressing this problem.
> >
> > Thanks !!
> >
>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
>
>




Re: help with field definition

2016-09-16 Thread Ray Niu
Just add q.op=OR to change default operator to OR and it should work
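For example (a sketch only; the field routing from the original query is left
out, q.op and debugQuery are standard parameters):

    select?q=Justin\ Beiber&q.op=OR&debugQuery=true

With q.op=OR the two parsed clauses come back as optional instead of required.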

2016-09-16 12:44 GMT-07:00 Gandham, Satya :

> Hi Emir,
>
>Thanks for your reply. But I’m afraid I’m not seeing the
> expected response. I’ve included the query and the corresponding debug
> portion of the response:
>
> select?q=Justin\ Beiber=exactName_noAlias_en_US
>
>  Debug:
>
> "rawquerystring":"Justin\\ Beiber",
> "querystring":"Justin\\ Beiber",
> "parsedquery":"+((exactName_noAlias_en_US:justin
> exactName_noAlias_en_US:justin beiber)/no_coord) +exactName_noAlias_en_US:
> beiber",
> "parsedquery_toString":"+(exactName_noAlias_en_US:justin
> exactName_noAlias_en_US:justin beiber) +exactName_noAlias_en_US:beiber",
> "explain":{},
>
>
> Satya.
>
> On 9/16/16, 2:46 AM, "Emir Arnautovic" 
> wrote:
>
> Hi,
>
> I missed that you already did define field and you are having troubles
> with query (did not read stackoverflow). Added answer there, but just
> in
> case somebody else is having similar troubles, issue is how query is
> written - space has to be escaped:
>
>q=Justin\ Bieber
>
> Regards,
> Emir
>
> On 13.09.2016 23:27, Gandham, Satya wrote:
> > HI,
> >
> >I need help with defining a field ‘singerName’ with the
> right tokenizers and filters such that it gives me the below described
> behavior:
> >
> > I have a few documents as given below:
> >
> > Doc 1
> >singerName: Justin Beiber
> > Doc 2:
> >singerName: Justin Timberlake
> > …
> >
> >
> > Below is the list of queries and the corresponding matches:
> >
> > Query 1: “My fav artist Justin Beiber is very impressive”
> > Docs Matched : Doc1
> >
> > Query 2: “I have a Justin Timberlake poster on my wall”
> > Docs Matched: Doc2
> >
> > Query 3: “The name Bieber Justin is unique”
> > Docs Matched: None
> >
> > Query 4: “Timberlake is a lake of timber..?”
> > Docs Matched: None.
> >
> > I have this described a bit more detailed here:
> http://stackoverflow.com/questions/39399321/solr-shingle-query-matching-
> keyword-tokenized-field
> >
> > I’d appreciate any help in addressing this problem.
> >
> > Thanks !!
> >
>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
>
>


Re: Distributing nodes with the collections API RESTORE command

2016-09-16 Thread Stephen Lewis
Hello,

I've tried this on both solr 6.1 and 6.2, with the same result. You are
right that the collections API offering collection-level backup/restore
from a remote server is a new feature.

After some more experimentation, I am fairly sure that this is a bug
specific to how leaders are placed during backup restore. After I ran a
command to restore a backup of the collection "foo" (which has
maxShardsPerNode set to 1 as well) with a replication factor of 2, I see
consistently that the followers (replica > 1) are correctly distributed,
but all of the leaders are brought up hosted on one node.

*Repro*

*create *
http://solr.test:8983/solr/admin/collections?action=
CREATE=foo=3=1
configName=test-one
(after creation, all shards are on different nodes as expected)

*backup*
http://solr.test:8983/solr/admin/collections?action=
BACKUP=foo-2=foo=foo-2

*delete*
http://solr.test:8983/solr/admin/collections?action=DELETE=foo

*restore*
Result: All leaders are hosted on one node; followers are spread about.

 {
  "responseHeader" : { "status" : 0,"QTime" : 7},
  "cluster" : {
"collections" : {
  "foo" : {
"replicationFactor" : "2",
"shards" : {
  "shard2" : {
"range" : "d555-2aa9",
"state" : "active",
"replicas" : {
  "core_node1" : {
"core" : "foo_shard2_replica0",
"base_url" : "http://IP1:8983/solr;,
"node_name" : "IP1:8983_solr",
"state" : "active",
"leader" : "true"
  },
  "core_node4" : {
"core" : "foo_shard2_replica1",
"base_url" : "http://IP2:8983/solr;,
"node_name" : "IP2:8983_solr",
"state" : "recovering"
  }
}
  },
  "shard3" : {
"range" : "2aaa-7fff",
"state" : "active",
"replicas" : {
  "core_node2" : {
"core" : "foo_shard3_replica0",
"base_url" : "http://IP1:8983/solr;,
"node_name" : "IP1:8983_solr",
"state" : "active",
"leader" : "true"
  },
  "core_node5" : {
"core" : "foo_shard3_replica1",
"base_url" : "http://IP3:8983/solr;,
"node_name" : "IP3:8983_solr",
"state" : "recovering"
  }
}
  },
  "shard1" : {
"range" : "8000-d554",
"state" : "active",
"replicas" : {
  "core_node3" : {
"core" : "foo_shard1_replica0",
"base_url" : "http://IP1:8983/solr;,
"node_name" : "IP1:8983_solr",
"state" : "active",
"leader" : "true"
  },
  "core_node6" : {
"core" : "foo_shard1_replica1",
"base_url" : "http://IP4:8983/solr;,
"node_name" : "IP4:8983_solr",
"state" : "recovering"
  }
}
  }
},
"router" : {
  "name" : "compositeId"
},
"maxShardsPerNode" : "1",
"autoAddReplicas" : "false",
"znodeVersion" : 204,
"configName" : "test-one"
  }
},
"properties" : {
  "location" : "/mnt/solr_backups"
},
"live_nodes" : [
  "IP5:8983_solr",
  "IP3:8983_solr",
  "IP6:8983_solr",
  "IP4:8983_solr",
  "IP7:8983_solr",
  "IP1:8983_solr",
  "IP8:8983_solr",
  "IP9:8983_solr",
  "IP2:8983_solr"]
  }
}


On Fri, Sep 16, 2016 at 2:07 PM, Reth RM  wrote:

> Which version of solr? Afaik, until 6.1, the solr backup and restore
> command apis required a separate backup for each shard, and then a restore
> along the same lines (both go shard by shard). Version 6.1 seems to have a
> new feature for backing up an entire collection and then restoring it back
> into a new collection setup (did not try yet).
>
>
> On Thu, Sep 15, 2016 at 1:45 PM, Stephen Lewis  wrote:
>
> > Hello,
> >
> > I have a solr cloud cluster in a test environment running 6.1 where I am
> > looking at using the collections API BACKUP and RESTORE commands to
> manage
> > data integrity.
> >
> > When restoring from a backup, I'm finding the same behavior occurs every
> > time; after the restore command, all shards are being hosted on one node.
> > What's especially surprising about this is that there are 6 live nodes
> > beforehand, the collection has maxShardsPerNode set to 1, and this occurs
> > even if I pass through the parameter maxShardsPerNode=1 to the API call.
> Is
> > there perhaps somewhere else I need to configure something, or another
> step
> > I am missing? If perhaps I'm misunderstanding the intention of these
> > parameters, could you clarify for me and let me know how to support
> > restoring different shards on 

Re: Solr Cloud Using Docker

2016-09-16 Thread John Bickerstaff
In case this is helpful - sponsored by Lucidworks

https://hub.docker.com/_/solr/
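Getting a single node up with it is roughly this (a sketch from memory of the
image's README, so double-check there; the container and core names are
arbitrary):

    docker run -d -p 8983:8983 --name my_solr solr
    docker exec -it my_solr bin/solr create_core -c gettingstarted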

I can't speak to the pros and cons of using it in production except to say
that they are probably the same as the pros and cons of running Docker in
production (as in nothing is perfect)

HTH...

On Fri, Sep 16, 2016 at 2:37 PM, Brendan Grainger 
wrote:

> Hi,
>
> Does anyone use docker for deploying solr? I am using it for running a
> single solr server ‘cloud’ locally on my dev box, but wondering about the
> pros/cons of using it in production.
>
> Thanks,
> Brendan


Re: Distributing nodes with the collections API RESTORE command

2016-09-16 Thread Reth RM
Which version of solr? Afaik, until 6.1, the solr backup and restore command
apis required a separate backup for each shard, and then a restore along the
same lines (both go shard by shard). Version 6.1 seems to have a new feature
for backing up an entire collection and then restoring it back into a new
collection setup (did not try yet).
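For reference, the collection-level calls in 6.1+ look roughly like this (a
sketch using the documented parameters; the collection name, backup name and
shared location are placeholders):

    /admin/collections?action=BACKUP&name=mybackup&collection=foo&location=/mnt/solr_backups
    /admin/collections?action=RESTORE&name=mybackup&collection=foo_restored&location=/mnt/solr_backups&maxShardsPerNode=1&replicationFactor=2

RESTORE also accepts placement-related overrides such as maxShardsPerNode and
replicationFactor for the collection it creates.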


On Thu, Sep 15, 2016 at 1:45 PM, Stephen Lewis  wrote:

> Hello,
>
> I have a solr cloud cluster in a test environment running 6.1 where I am
> looking at using the collections API BACKUP and RESTORE commands to manage
> data integrity.
>
> When restoring from a backup, I'm finding the same behavior occurs every
> time; after the restore command, all shards are being hosted on one node.
> What's especially surprising about this is that there are 6 live nodes
> beforehand, the collection has maxShardsPerNode set to 1, and this occurs
> even if I pass through the parameter maxShardsPerNode=1 to the API call. Is
> there perhaps somewhere else I need to configure something, or another step
> I am missing? If perhaps I'm misunderstanding the intention of these
> parameters, could you clarify for me and let me know how to support
> restoring different shards on different nodes?
>
> Full repro below.
>
> Thanks!
>
>
> *Repro*
>
> *Cluster state before*
>
> http://54.85.30.39:8983/solr/admin/collections?action=
> CLUSTERSTATUS=json
>
> {
>   "responseHeader" : {"status" : 0,"QTime" : 4},
>   "cluster" : {
> "collections" : {},
> "live_nodes" : [
>   "172.18.7.153:8983_solr",
>"172.18.2.20:8983_solr",
>"172.18.10.88:8983_solr",
>"172.18.6.224:8983_solr",
>"172.18.8.255:8983_solr",
>"172.18.2.21:8983_solr"]
>   }
> }
>
>
> *Restore Command (formatted for ease of reading)*
>
> http://54.85.30.39:8983/solr/admin/collections?action=RESTORE
>
> =panopto
> =backup-4
>
> =/mnt/beta_solr_backups
> =2016-09-02
>
> =1
>
> 
> 
> 0
> 16
> 
> backup-4
> 
>
>
> *Cluster state after*
>
> http://54.85.30.39:8983/solr/admin/collections?action=
> CLUSTERSTATUS=json
>
> {
>   "responseHeader" : {"status" : 0,"QTime" : 8},
>   "cluster" : {
> "collections" : {
>   "panopto" : {
> "replicationFactor" : "1",
> "shards" : {
>   "shard2" : {
> "range" : "0-7fff",
> "state" : "construction",
> "replicas" : {
>   "core_node1" : {
> "core" : "panopto_shard2_replica0",
> "base_url" : "http://172.18.2.21:8983/solr;,
> "node_name" : "172.18.2.21:8983_solr",
> "state" : "active",
> "leader" : "true"
>   }
> }
>   },
>   "shard1" : {
> "range" : "8000-",
> "state" : "construction",
> "replicas" : {
>   "core_node2" : {
> "core" : "panopto_shard1_replica0",
> "base_url" : "http://172.18.2.21:8983/solr;,
> "node_name" : "172.18.2.21:8983_solr",
> "state" : "active",
> "leader" : "true"
>   }
> }
>   }
> },
> "router" : {
>   "name" : "compositeId"
> },
> "maxShardsPerNode" : "1",
> "autoAddReplicas" : "false",
> "znodeVersion" : 44,
> "configName" : "panopto"
>   }
> },
> "live_nodes" : ["172.18.7.153:8983_solr", "172.18.2.20:8983_solr",
> "172.18.10.88:8983_solr", "172.18.6.224:8983_solr", "172.18.8.255:8983
> _solr",
> "172.18.2.21:8983_solr"]
>   }
> }
>
>
>
>
> --
> Stephen
>
> (206)753-9320
> stephen-lewis.net
>


Re: Exception is thrown when using TimestampUpdateProcessorFactory

2016-09-16 Thread Reth RM
Hi Preeti,

Try adding a default attribute to the solrtimestamp field in schema and
check if this resolves the issue.

replace  with correct default date format
https://cwiki.apache.org/confluence/display/solr/Defining+Fields
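For example, assuming a date fieldType named "date" is already defined (a
sketch, not your exact schema), the field line would look something like:

    <field name="solrtimestamp" type="date" indexed="true" stored="true" multiValued="false" default="NOW"/>

A default of "NOW" is evaluated at index time, so each document gets the time
it was indexed even if the field is not supplied.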


On Thu, Sep 15, 2016 at 5:32 AM, preeti kumari 
wrote:

> Hi All,
>
> I am trying to get solr index time as solrtimestamp field.
>
>
>  omitNorms="true"/>
>
> I am using solr 5.2.1 in solr cloud mode.
>
>
>  
>  
>solrtimestamp
>   
>   
> update-script.js
> 
>   example config parameter
> 
>  xnum,xnum2
>   
> 
> 
>  
>
> But I am getting below exception when i run update or through DIH. Please
> let me know how to fix this.
>
> java.lang.NullPointerException
> at
> org.apache.solr.update.processor.TimestampUpdateProcessorFactor
> y$1.getDefaultValue(TimestampUpdateProcessorFactory.java:66)
> at
> org.apache.solr.update.processor.AbstractDefaultValueUpdateProc
> essorFactory$DefaultValueUpdateProcessor.processAdd(
> AbstractDefaultValueUpdateProcessorFactory.java:91)
> at
> org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:71)
> at
> org.apache.solr.handler.dataimport.DataImportHandler$
> 1.upload(DataImportHandler.java:259)
> at
> org.apache.solr.handler.dataimport.DocBuilder.
> buildDocument(DocBuilder.java:524)
> at
> org.apache.solr.handler.dataimport.DocBuilder.
> buildDocument(DocBuilder.java:414)
> at
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:
> 329)
> at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
> at
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.
> java:416)
> at
> org.apache.solr.handler.dataimport.DataImporter.
> runCmd(DataImporter.java:480)
> at
> org.apache.solr.handler.dataimport.DataImporter$1.run(
> DataImporter.java:461)
>


Solr Cloud Using Docker

2016-09-16 Thread Brendan Grainger
Hi,

Does anyone use docker for deploying solr? I am using it for running a single 
solr server ‘cloud’ locally on my dev box, but wondering about the pros/cons of 
using it in production.

Thanks,
Brendan

Re: help with field definition

2016-09-16 Thread Gandham, Satya
Hi Emir,

   Thanks for your reply. But I’m afraid I’m not seeing the expected 
response. I’ve included the query and the corresponding debug portion of the 
response:

select?q=Justin\ Beiber=exactName_noAlias_en_US
 
 Debug:
 
"rawquerystring":"Justin\\ Beiber",
"querystring":"Justin\\ Beiber",
"parsedquery":"+((exactName_noAlias_en_US:justin 
exactName_noAlias_en_US:justin beiber)/no_coord) 
+exactName_noAlias_en_US:beiber",
"parsedquery_toString":"+(exactName_noAlias_en_US:justin 
exactName_noAlias_en_US:justin beiber) +exactName_noAlias_en_US:beiber",
"explain":{},


Satya.

On 9/16/16, 2:46 AM, "Emir Arnautovic"  wrote:

Hi,

I missed that you already did define field and you are having troubles 
with query (did not read stackoverflow). Added answer there, but just in 
case somebody else is having similar troubles, issue is how query is 
written - space has to be escaped:

   q=Justin\ Bieber

Regards,
Emir

On 13.09.2016 23:27, Gandham, Satya wrote:
> HI,
>
>I need help with defining a field ‘singerName’ with the right 
tokenizers and filters such that it gives me the below described behavior:
>
> I have a few documents as given below:
>
> Doc 1
>singerName: Justin Beiber
> Doc 2:
>singerName: Justin Timberlake
> …
>
>
> Below is the list of queries and the corresponding matches:
>
> Query 1: “My fav artist Justin Beiber is very impressive”
> Docs Matched : Doc1
>
> Query 2: “I have a Justin Timberlake poster on my wall”
> Docs Matched: Doc2
>
> Query 3: “The name Bieber Justin is unique”
> Docs Matched: None
>
> Query 4: “Timberlake is a lake of timber..?”
> Docs Matched: None.
>
> I have this described a bit more detailed here: 
http://stackoverflow.com/questions/39399321/solr-shingle-query-matching-keyword-tokenized-field
>
> I’d appreciate any help in addressing this problem.
>
> Thanks !!
>

-- 
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/





Re: Tutorial not working for me

2016-09-16 Thread Pritchett, James
Thanks for that. I totally get how it is with complicated, open source
projects. And from experience, I realize that beginner-level documentation
is really hard, especially with these kinds of projects: by the time you
get to documentation, everybody involved is so expert in all the details
that they can't imagine approaching from a blank slate.

Thanks for the suggestions. Had to chuckle, though: one of your links (
quora.com) is the one that I started with. Step 1: "Download Solr, actually
do the tutorial ..."

Best wishes,

James

On Fri, Sep 16, 2016 at 1:41 PM, John Bickerstaff 
wrote:

> I totally empathize about the sense of wasted time.  On Solr in particular
> I pulled my hair out for months - and I had access to people who had been
> using it for over two years!!!
>
> For what it's worth - this is kind of how it goes with most open source
> projects in my experience.  It's painful - and - the more moving parts the
> open source project has, the more painful the learning curve (usually)...
>
> But - the good news is that's why this list is here - we're all trying to
> help each other, so feel free to ping the list sooner rather than later
> when you're frustrated.  My new rule is one hour of being blocked...  I
> used to wait days - but everyone on the list seems to really understand how
> frustrating it is to be stuck and people have really taken time to help me
> - so I'm less hesitant.  And, of course, I try to pay it forward by
> contributing as much as I can in the same way.
>
> On that note: I've been particularly focused on working with Solr in terms
> of being able to keep upgrading simple by just replacing and re-indexing so
> if you have questions on that space (Disaster Recovery, Zookeeper config,
> etc) I may be able to help - and if you're looking for "plan" for building
> and maintaining a simple solrCloud working model on Ubuntu VMs on
> VirtualBox, I can *really* help you.
>
> Off the top of my head - some places to start:
>
> http://yonik.com/getting-started-with-solr/
> https://www.quora.com/What-is-the-best-way-to-learn-SOLR
> http://blog.outerthoughts.com/2015/11/learning-solr-comprehensively/
> http://www.solr-start.com/
>
> I think everyone responsible for those links is also a frequent "helper" on
> this email forum.
>
> Also (and I'm aware it's a glass half-full thing which frequently irritates
> me, but I'll say it anyway).  Having run into this problem I'm willing to
> wager you'll never forget this particular quirk and if you see the problem
> in future, you'll know exactly what's wrong.  It shouldn't have been
> "wrong" with the example, but for my part at least - I've begun to think of
> stuff like this as just part of the learning curve because it happens
> nearly every time.
>
> Software is hard - complex projects like SOLR are hard.  It's why we get
> paid to do stuff like this.  I'm actually getting paid pretty well right
> now because Solr is recognized as difficult and I have (with many thanks to
> this list) become known as someone who "knows Solr"...
>
> It *could* and *should* be better, but open source is what it is as a
> result of the sum total of what everyone has contributed - and we're all
> happy to help you as best we can.
>
>
>
> On Fri, Sep 16, 2016 at 11:13 AM, Pritchett, James <
> jpritch...@learningally.org> wrote:
>
> > Second possibility: You've somehow indexed fields as
> > "string" type rather than one of the text based fieldTypes.
> > "string" types are not tokenized, thus a field with
> > "My dog has fleas" will fail to find "My". It'll even not match
> > "my dog has fleas" (note capital "M").
> >
> > That appears to be the issue. Searching for name:Foundation indeed
> returns
> > the expected result. I will now go find some better entry point to SOLR
> > than the tutorial, which has wasted enough of my time for one day. Any
> > suggestions would be welcome.
> >
> > James
> >
> > On Fri, Sep 16, 2016 at 11:40 AM, Erick Erickson <
> erickerick...@gmail.com>
> > wrote:
> >
> > > My bet:
> > > the fields (look in managed_schema or, possibly schema.xml)
> > > has stored="true" and indexed="false" set for the fields
> > > in question.
> > >
> > > Pretty much everyone takes a few passes before this really
> > > makes sense. "stored" means you see the results returned,
> > > "indexed" must be true before you can search on something.
> > >
> > > Second possibility: You've somehow indexed fields as
> > > "string" type rather than one of the text based fieldTypes.
> > > "string" types are not tokenized, thus a field with
> > > "My dog has fleas" will fail to find "My". It'll even not match
> > > "my dog has fleas" (note capital "M").
> > >
> > > The admin UI>>select core>>analysis page will show you
> > > lots of this kind of detail, although I admit it takes a bit to
> > > understand all the info (do un-check the "verbose" button
> > > for the nonce).
> > >
> > > Now, all that aside, please show us the field definition for

Re: Tutorial not working for me

2016-09-16 Thread John Bickerstaff
I totally empathize about the sense of wasted time.  On Solr in particular
I pulled my hair out for months - and I had access to people who had been
using it for over two years!!!

For what it's worth - this is kind of how it goes with most open source
projects in my experience.  It's painful - and - the more moving parts the
open source project has, the more painful the learning curve (usually)...

But - the good news is that's why this list is here - we're all trying to
help each other, so feel free to ping the list sooner rather than later
when you're frustrated.  My new rule is one hour of being blocked...  I
used to wait days - but everyone on the list seems to really understand how
frustrating it is to be stuck and people have really taken time to help me
- so I'm less hesitant.  And, of course, I try to pay it forward by
contributing as much as I can in the same way.

On that note: I've been particularly focused on working with Solr in terms
of being able to keep upgrading simple by just replacing and re-indexing so
if you have questions on that space (Disaster Recovery, Zookeeper config,
etc) I may be able to help - and if you're looking for "plan" for building
and maintaining a simple solrCloud working model on Ubuntu VMs on
VirtualBox, I can *really* help you.

Off the top of my head - some places to start:

http://yonik.com/getting-started-with-solr/
https://www.quora.com/What-is-the-best-way-to-learn-SOLR
http://blog.outerthoughts.com/2015/11/learning-solr-comprehensively/
http://www.solr-start.com/

I think everyone responsible for those links is also a frequent "helper" on
this email forum.

Also (and I'm aware it's a glass half-full thing which frequently irritates
me, but I'll say it anyway).  Having run into this problem I'm willing to
wager you'll never forget this particular quirk and if you see the problem
in future, you'll know exactly what's wrong.  It shouldn't have been
"wrong" with the example, but for my part at least - I've begun to think of
stuff like this as just part of the learning curve because it happens
nearly every time.

Software is hard - complex projects like SOLR are hard.  It's why we get
paid to do stuff like this.  I'm actually getting paid pretty well right
now because Solr is recognized as difficult and I have (with many thanks to
this list) become known as someone who "knows Solr"...

It *could* and *should* be better, but open source is what it is as a
result of the sum total of what everyone has contributed - and we're all
happy to help you as best we can.



On Fri, Sep 16, 2016 at 11:13 AM, Pritchett, James <
jpritch...@learningally.org> wrote:

> Second possibility: You've somehow indexed fields as
> "string" type rather than one of the text based fieldTypes.
> "string" types are not tokenized, thus a field with
> "My dog has fleas" will fail to find "My". It'll even not match
> "my dog has fleas" (note capital "M").
>
> That appears to be the issue. Searching for name:Foundation indeed returns
> the expected result. I will now go find some better entry point to SOLR
> than the tutorial, which has wasted enough of my time for one day. Any
> suggestions would be welcome.
>
> James
>
> On Fri, Sep 16, 2016 at 11:40 AM, Erick Erickson 
> wrote:
>
> > My bet:
> > the fields (look in managed_schema or, possibly schema.xml)
> > has stored="true" and indexed="false" set for the fields
> > in question.
> >
> > Pretty much everyone takes a few passes before this really
> > makes sense. "stored" means you see the results returned,
> > "indexed" must be true before you can search on something.
> >
> > Second possibility: You've somehow indexed fields as
> > "string" type rather than one of the text based fieldTypes.
> > "string" types are not tokenized, thus a field with
> > "My dog has fleas" will fail to find "My". It'll even not match
> > "my dog has fleas" (note capital "M").
> >
> > The admin UI>>select core>>analysis page will show you
> > lots of this kind of detail, although I admit it takes a bit to
> > understand all the info (do un-check the "verbose" button
> > for the nonce).
> >
> > Now, all that aside, please show us the field definition for
> > one of the fields in question and, as John mentions, the exact
> > query (I'd also add debug=true to the results).
> >
> > Saying you followed the exact instructions somewhere isn't
> > really helpful. It's likely that there's something innocent-seeming
> > that was done differently. Giving the information asked for
> > will help us diagnose what's happening and, perhaps,
> > improve the docs if we can understand the mis-match.
> >
> > Best,
> > Erick
> >
> > On Fri, Sep 16, 2016 at 8:28 AM, Pritchett, James
> >  wrote:
> > > I am following the exact instructions in the tutorial: copy and pasting
> > all
> > > commands & queries from the tutorial:
> > > https://lucene.apache.org/solr/quickstart.html. Where it breaks down
> is
> > > this one:
> > >
> > > 

Re: slow updates/searches

2016-09-16 Thread Rallavagu

Comments in line...

On 9/16/16 10:15 AM, Erick Erickson wrote:

Well, the next thing I'd look at is CPU activity. If you're flooding the system
with updates there'll be CPU contention.


Monitoring does not suggest any high CPU, but as you can see from the vmstat 
output, "user" cpu is a bit high during updates that are taking time (34 
user, 65 idle).




And there are a number of things you can do that make updates in particular
much less efficient, from committing very frequently (sometimes combined
with excessive autowarm parameters) and the like.


softCommit is set to 10 minutes, autowarm count is set to 0 and commit 
is set to 15 sec for NRT.
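In solrconfig.xml terms that is roughly the following (a sketch of the stated
intervals, not a paste of the actual config):

    <autoCommit>
      <maxTime>15000</maxTime>       <!-- commit every 15 sec -->
    </autoCommit>
    <autoSoftCommit>
      <maxTime>600000</maxTime>      <!-- softCommit every 10 minutes -->
    </autoSoftCommit>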




There are a series of ideas that might trigger an "aha" moment:
https://wiki.apache.org/solr/SolrPerformanceFactors


Reviewed this document and made a few changes accordingly a while ago.


But the crude measure is just to look at CPU usage when updates happen, or
just before. Are you running hot with queries alone then add an update burden?


Essentially, it was high QTimes for queries that got me looking into the 
logs, the system, etc., and I could correlate update slowness with search 
slowness. The other time QTimes go high is right after a softCommit, which 
is expected.


Wondering what causes the update threads to wait and whether it has any 
impact on search at all. I had a couple more CPUs added but I still see 
similar behavior.


Thanks.



Best,
Erick

On Fri, Sep 16, 2016 at 9:19 AM, Rallavagu  wrote:

Erick,

Was monitoring GC activity and couldn't align GC pauses to this behavior.
Also, the vmstat shows no swapping or cpu I/O wait. However, whenever I see
high update response times (corresponding high QTimes for searches), vmstat
shows a series of "waiting to runnable" processes in the "r" column
of the "procs" section.

https://dl.dropboxusercontent.com/u/39813705/Screen%20Shot%202016-09-16%20at%209.05.51%20AM.png

procs ---memory-- ---swap--
-io -system-- cpu -timestamp-
 r  b swpd freeinact   active   si   so bibo
in   cs  us  sy  id  wa  st CDT
 2  071068 18688496  2526604 2420444000 0 0
1433  462  27   1  73   0   0 2016-09-16 11:02:32
 1  071068 18688180  2526600 2420456800 0 0
1388  404  26   1  74   0   0 2016-09-16 11:02:33
 1  071068 18687928  2526600 2420456800 0 0
1354  401  25   0  75   0   0 2016-09-16 11:02:34
 1  071068 18687800  2526600 2420457200 0 0
1311  397  25   0  74   0   0 2016-09-16 11:02:35
 1  071068 18687164  2527116 2420484400 0 0
1770  702  31   1  69   0   0 2016-09-16 11:02:36
 1  071068 18686944  2527108 2420490800 052
1266  421  26   0  74   0   0 2016-09-16 11:02:37
12  171068 18682676  2528560 2420711600 0   280
2388  934  34   1  65   0   0 2016-09-16 11:02:38
 2  171068 18651340  2530820 2423336800 0  1052
10258 5696  82   5  13   0   0 2016-09-16 11:02:39
 5  071068 18648600  2530112 2423506000 0  1988
7261 3644  84   2  13   1   0 2016-09-16 11:02:40
 9  171068 18647804  2530580 2423607600 0  1688
7031 3575  84   2  13   1   0 2016-09-16 11:02:41
 1  071068 18647628  2530364 2423625600 0   680
7065 4463  61   3  35   1   0 2016-09-16 11:02:42
 1  071068 18646344  2531204 2423653600 044
6422 4922  35   3  63   0   0 2016-09-16 11:02:43
 2  071068 18644460  2532196 2423744000 0 0
6561 5056  25   3  72   0   0 2016-09-16 11:02:44
 0  071068 18661900  2531724 2421876400 0 0
7312 10050  11   3  86   0   0 2016-09-16 11:02:45
 2  071068 18649400  2532228 2422980000 0 0
7211 6222  34   3  63   0   0 2016-09-16 11:02:46
 0  071068 18648280  2533440 2423030000 0   108
3936 3381  20   1  79   0   0 2016-09-16 11:02:47
 0  071068 18648156  2533212 2423068400 012
1279 1681   2   0  97   0   0 2016-09-16 11:02:48


Captured stack trace including timing for one of the update threads.


org.eclipse.jetty.server.handler.ContextHandler:doHandle (method time = 15
ms, total time = 30782 ms)
 Filter - SolrDispatchFilter:doFilter:181 (method time = 0 ms, total time =
30767 ms)
  Filter - SolrDispatchFilter:doFilter:223 (method time = 0 ms, total time =
30767 ms)
   org.apache.solr.servlet.HttpSolrCall:call:457 (method time = 0 ms, total
time = 30767 ms)
org.apache.solr.servlet.HttpSolrCall:execute:658 (method time = 0 ms,
total time = 30767 ms)
 org.apache.solr.core.SolrCore:execute:2073 (method time = 0 ms, total
time = 30767 ms)
  

Re: slow updates/searches

2016-09-16 Thread Erick Erickson
Well, the next thing I'd look at is CPU activity. If you're flooding the system
with updates there'll be CPU contention.

And there are a number of things you can do that make updates in particular
much less efficient, from committing very frequently (sometimes combined
with excessive autowarm parameters) and the like.
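The autowarm knobs live on the caches in solrconfig.xml, e.g. (a sketch):

    <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>

Every commit that opens a new searcher re-executes up to autowarmCount entries
per cache, so large values combined with frequent commits cost real CPU.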

There are a series of ideas that might trigger an "aha" moment:
https://wiki.apache.org/solr/SolrPerformanceFactors

But the crude measure is just to look at CPU usage when updates happen, or
just before. Are you running hot with queries alone then add an update burden?

Best,
Erick

On Fri, Sep 16, 2016 at 9:19 AM, Rallavagu  wrote:
> Erick,
>
> Was monitoring GC activity and couldn't align GC pauses to this behavior.
> Also, the vmstat shows no swapping or cpu I/O wait. However, whenever I see
> high update response times (corresponding high QTimes for searches), vmstat
> shows a series of "waiting to runnable" processes in the "r" column
> of the "procs" section.
>
> https://dl.dropboxusercontent.com/u/39813705/Screen%20Shot%202016-09-16%20at%209.05.51%20AM.png
>
> procs ---memory-- ---swap--
> -io -system-- cpu -timestamp-
>  r  b swpd freeinact   active   si   so bibo
> in   cs  us  sy  id  wa  st CDT
>  2  071068 18688496  2526604 2420444000 0 0
> 1433  462  27   1  73   0   0 2016-09-16 11:02:32
>  1  071068 18688180  2526600 2420456800 0 0
> 1388  404  26   1  74   0   0 2016-09-16 11:02:33
>  1  071068 18687928  2526600 2420456800 0 0
> 1354  401  25   0  75   0   0 2016-09-16 11:02:34
>  1  071068 18687800  2526600 2420457200 0 0
> 1311  397  25   0  74   0   0 2016-09-16 11:02:35
>  1  071068 18687164  2527116 2420484400 0 0
> 1770  702  31   1  69   0   0 2016-09-16 11:02:36
>  1  071068 18686944  2527108 2420490800 052
> 1266  421  26   0  74   0   0 2016-09-16 11:02:37
> 12  171068 18682676  2528560 2420711600 0   280
> 2388  934  34   1  65   0   0 2016-09-16 11:02:38
>  2  171068 18651340  2530820 2423336800 0  1052
> 10258 5696  82   5  13   0   0 2016-09-16 11:02:39
>  5  071068 18648600  2530112 2423506000 0  1988
> 7261 3644  84   2  13   1   0 2016-09-16 11:02:40
>  9  171068 18647804  2530580 2423607600 0  1688
> 7031 3575  84   2  13   1   0 2016-09-16 11:02:41
>  1  071068 18647628  2530364 2423625600 0   680
> 7065 4463  61   3  35   1   0 2016-09-16 11:02:42
>  1  071068 18646344  2531204 2423653600 044
> 6422 4922  35   3  63   0   0 2016-09-16 11:02:43
>  2  071068 18644460  2532196 2423744000 0 0
> 6561 5056  25   3  72   0   0 2016-09-16 11:02:44
>  0  071068 18661900  2531724 2421876400 0 0
> 7312 10050  11   3  86   0   0 2016-09-16 11:02:45
>  2  071068 18649400  2532228 2422980000 0 0
> 7211 6222  34   3  63   0   0 2016-09-16 11:02:46
>  0  071068 18648280  2533440 2423030000 0   108
> 3936 3381  20   1  79   0   0 2016-09-16 11:02:47
>  0  071068 18648156  2533212 2423068400 012
> 1279 1681   2   0  97   0   0 2016-09-16 11:02:48
>
>
> Captured stack trace including timing for one of the update threads.
>
>
> org.eclipse.jetty.server.handler.ContextHandler:doHandle (method time = 15
> ms, total time = 30782 ms)
>  Filter - SolrDispatchFilter:doFilter:181 (method time = 0 ms, total time =
> 30767 ms)
>   Filter - SolrDispatchFilter:doFilter:223 (method time = 0 ms, total time =
> 30767 ms)
>org.apache.solr.servlet.HttpSolrCall:call:457 (method time = 0 ms, total
> time = 30767 ms)
> org.apache.solr.servlet.HttpSolrCall:execute:658 (method time = 0 ms,
> total time = 30767 ms)
>  org.apache.solr.core.SolrCore:execute:2073 (method time = 0 ms, total
> time = 30767 ms)
>   org.apache.solr.handler.RequestHandlerBase:handleRequest:156 (method
> time = 0 ms, total time = 30767 ms)
>
> org.apache.solr.handler.ContentStreamHandlerBase:handleRequestBody:70
> (method time = 0 ms, total time = 30767 ms)
> org.apache.solr.handler.UpdateRequestHandler$1:load:95 (method time
> = 0 ms, total time = 23737 ms)
>  org.apache.solr.handler.loader.XMLLoader:load:178 (method time = 0
> ms, total time = 23737 ms)
>   org.apache.solr.handler.loader.XMLLoader:processUpdate:251 (method
> time = 0 ms, total time = 23737 ms)
>
> org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor:processAdd:104
> (method time = 0 ms, total time = 23737 ms)
>
> 

Re: Tutorial not working for me

2016-09-16 Thread Pritchett, James
Second possibility: You've somehow indexed fields as
"string" type rather than one of the text based fieldTypes.
"string" types are not tokenized, thus a field with
"My dog has fleas" will fail to find "My". It'll even not match
"my dog has fleas" (note capital "M").

That appears to be the issue. Searching for name:Foundation indeed returns
the expected result. I will now go find some better entry point to SOLR
than the tutorial, which has wasted enough of my time for one day. Any
suggestions would be welcome.

James

On Fri, Sep 16, 2016 at 11:40 AM, Erick Erickson 
wrote:

> My bet:
> the fields (look in managed_schema or, possibly schema.xml)
> has stored="true" and indexed="false" set for the fields
> in question.
>
> Pretty much everyone takes a few passes before this really
> makes sense. "stored" means you see the results returned,
> "indexed" must be true before you can search on something.
>
> Second possibility: You've somehow indexed fields as
> "string" type rather than one of the text based fieldTypes.
> "string" types are not tokenized, thus a field with
> "My dog has fleas" will fail to find "My". It'll even not match
> "my dog has fleas" (note capital "M").
>
> The admin UI>>select core>>analysis page will show you
> lots of this kind of detail, although I admit it takes a bit to
> understand all the info (do un-check the "verbose" button
> for the nonce).
>
> Now, all that aside, please show us the field definition for
> one of the fields in question and, as John mentions, the exact
> query (I'd also add debug=true to the results).
>
> Saying you followed the exact instructions somewhere isn't
> really helpful. It's likely that there's something innocent-seeming
> that was done differently. Giving the information asked for
> will help us diagnose what's happening and, perhaps,
> improve the docs if we can understand the mis-match.
>
> Best,
> Erick
>
> On Fri, Sep 16, 2016 at 8:28 AM, Pritchett, James
>  wrote:
> > I am following the exact instructions in the tutorial: copy and pasting
> all
> > commands & queries from the tutorial:
> > https://lucene.apache.org/solr/quickstart.html. Where it breaks down is
> > this one:
> >
> > http://localhost:8983/solr/gettingstarted/select?wt=json;
> indent=true=name:foundation
> >
> > This returns no results. Tried in the web admin view as well, also tried
> > various field:value combinations to no avail. Clearly something didn't
> get
> > configured correctly, but I saw no error messages when running all the
> data
> > loads, etc. given in the tutorial.
> >
> > Sorry to be so clueless, but I don't really have anything to go on for
> > troubleshooting besides asking dumb questions.
> >
> > James
> >
> > On Fri, Sep 16, 2016 at 11:24 AM, John Bickerstaff <
> j...@johnbickerstaff.com
> >> wrote:
> >
> >> Please share the exact query syntax?
> >>
> >> Are you using a collection you built or one of the examples?
> >>
> >> On Fri, Sep 16, 2016 at 9:06 AM, Pritchett, James <
> >> jpritch...@learningally.org> wrote:
> >>
> >> > I apologize if this is a really stupid question. I followed all
> >> > instructions on installing Tutorial, got data loaded, everything works
> >> > great until I try to query with a field name -- e.g.,
> name:foundation. I
> >> > get zero results from this or any other query which specifies a field
> >> name.
> >> > Simple queries return results, and the field names are listed in those
> >> > results correctly. But if I query using names that I know are there
> and
> >> > values that I know are there, I get nothing.
> >> >
> >> > I figure this must be something basic that is not right about the way
> >> > things have gotten set up, but I am completely blocked at this point.
> I
> >> > tried blowing it all away and restarting from scratch with no luck.
> Where
> >> > should I be looking for problems here? I am running this on a MacBook,
> >> OS X
> >> > 10.9, latest JDK (1.8).
> >> >
> >> > James
> >> >
> >> > --
> >> >
> >> >
> >> > *James Pritchett*
> >> >
> >> > Leader, Process Redesign and Analysis
> >> >
> >> > __
> >> >
> >> >
> >> > *Learning Ally™*Together It’s Possible
> >> > 20 Roszel Road | Princeton, NJ 08540 | Office: 609.243.7608
> >> >
> >> > jpritch...@learningally.org
> >> >
> >> > www.LearningAlly.org 
> >> >
> >> > Join us in building a community that helps blind, visually impaired &
> >> > dyslexic students thrive.
> >> >
> >> > Connect with our community: *Facebook*
> >> >  | *Twitter*
> >> >  | *LinkedIn*
> >> >  |
> >> > *Explore1in5*  | *Instagram*
> >> >  | *Sign up for our community
> >> > newsletter*  >> > touch/>
> >> >
> >> > Support us: 

Re: Searching Special charterer in solr behaving inconsistent

2016-09-16 Thread Erick Erickson
You really have to define _how_ it's not working, provide field definitions,
perhaps the result of adding debug=query to the URL. You might review:

http://wiki.apache.org/solr/UsingMailingLists

At a guess and based on fragmentary information, your second query is
searching against the default field, often "text" which I'd guess has a much
different analysis chain than title_v.
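A quick way to see the difference is to run both forms with debug enabled and
compare the parsedquery entries, e.g. (a sketch; the URL-encoded terms are
shortened here):

    /solr/trucollection/select?q=title__v:"¢ , £"&debug=query
    /solr/trucollection/select?q="¢ , £"&df=title__v&debug=query

If the second parse shows the terms going to a different field, or vanishing
after analysis, that's where the inconsistency comes from.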

Best,
Erick

On Fri, Sep 16, 2016 at 9:00 AM, shekhar  wrote:
> Problem: Searching special characters in solr behaves inconsistently
>
> Example: search characters ¢ , £ , ¤ , ¥ , ¦ , §
>
> When searching these characters against a particular field, say title__v,
> the result comes back as expected.
>
> *Query:*
>
> solr/trucollection/select?q=title__v:(*%C2%A2%5C+%2C%5C+%C2%A3%5C+%2C%5C+%C2%A4%5C+%2C%5C+%C2%A5%5C+%2C%5C+%C2%A6%5C+%2C%5C+%C2%A7*
> OR "%C2%A2+%2C+%C2%A3+%2C+%C2%A4+%2C+%C2%A5+%2C+%C2%A6+%2C+%C2%A7") AND
> (isrestricted:false OR groupid:("07C90F45-AA61-4B2A-A446-BEA13C8409E2" OR
> "23414BF3-046D-475A-9268-AA7BF5238DB3" OR
> "50FC6521-DBCC-4097-A3BA-03AE6C202C2F"))=0=50
> =owning_facility__v
> desc=on=shard=owning_facility__v=status__v=type__v=classification__v
> bulk_erp_code product_categorys owning_department__v docobjid erp_code
> category_of_change shard groupid title__v author brands hassupportingdocs
> last_periodic_review_date__vs technology__c document_language__c
> effective_date__v reason_code__c owning_segments__c isrestricted region__c
> document_number__v fileuniquename approved_date__vs impacted_departments__v
> severity id owning_facility__v change_coordinator lifecycle_categories__c
> status__v next_periodic_review_date__vs subtype__v version__v fg_erp_code
> impacted_facilities__v type__v franchises country__v policy
> legacy_document_number__c formula_number marketing_companys
> impacted_segments__c=json
>
> *
> But when we use the same search criteria using quick search (full text
> search), it is not working*
>
> *Query:*
>
> solr/trucollection/select?q=(*%C2%A2+%2C+%C2%A3+%2C+%C2%A4+%2C+%C2%A5+%2C+%C2%A6+%2C+%C2%A7*
> OR "%C2%A2+%2C+%C2%A3+%2C+%C2%A4+%2C+%C2%A5+%2C+%C2%A6+%2C+%C2%A7") AND
> (isrestricted:false OR groupid="07C90F45-AA61-4B2A-A446-BEA13C8409E2" OR
> "23414BF3-046D-475A-9268-AA7BF5238DB3" OR
> "50FC6521-DBCC-4097-A3BA-03AE6C202C2F")=0=50
> =owning_facility__v
> desc=on=shard=owning_facility__v=status__v=type__v=classification__v
> bulk_erp_code product_categorys owning_department__v docobjid erp_code
> category_of_change shard groupid title__v author brands hassupportingdocs
> last_periodic_review_date__vs technology__c document_language__c
> effective_date__v reason_code__c owning_segments__c isrestricted region__c
> document_number__v fileuniquename approved_date__vs impacted_departments__v
> severity id owning_facility__v change_coordinator lifecycle_categories__c
> status__v next_periodic_review_date__vs subtype__v version__v fg_erp_code
> impacted_facilities__v type__v franchises country__v policy
> legacy_document_number__c formula_number marketing_companys
> impacted_segments__c=json
>
>
> Can anyone help me out.
>
> Best regards,
> Shekhar
>
>
>
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Searching-Special-charterer-in-solr-behaving-inconsistent-tp4296493.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Searching Special charterer in solr behaving inconsistent

2016-09-16 Thread shekhar
Problem: Searching special characters in solr behaves inconsistently

Example: search characters ¢ , £ , ¤ , ¥ , ¦ , §

When searching these characters against a particular field, say title__v,
the result comes back as expected.

*Query:*

solr/trucollection/select?q=title__v:(*%C2%A2%5C+%2C%5C+%C2%A3%5C+%2C%5C+%C2%A4%5C+%2C%5C+%C2%A5%5C+%2C%5C+%C2%A6%5C+%2C%5C+%C2%A7*
OR "%C2%A2+%2C+%C2%A3+%2C+%C2%A4+%2C+%C2%A5+%2C+%C2%A6+%2C+%C2%A7") AND
(isrestricted:false OR groupid:("07C90F45-AA61-4B2A-A446-BEA13C8409E2" OR
"23414BF3-046D-475A-9268-AA7BF5238DB3" OR
"50FC6521-DBCC-4097-A3BA-03AE6C202C2F"))=0=50
=owning_facility__v
desc=on=shard=owning_facility__v=status__v=type__v=classification__v
bulk_erp_code product_categorys owning_department__v docobjid erp_code
category_of_change shard groupid title__v author brands hassupportingdocs
last_periodic_review_date__vs technology__c document_language__c
effective_date__v reason_code__c owning_segments__c isrestricted region__c
document_number__v fileuniquename approved_date__vs impacted_departments__v
severity id owning_facility__v change_coordinator lifecycle_categories__c
status__v next_periodic_review_date__vs subtype__v version__v fg_erp_code
impacted_facilities__v type__v franchises country__v policy
legacy_document_number__c formula_number marketing_companys
impacted_segments__c=json

*
But when we use the same search criteria using quick search (full text
search), it is not working*

*Query:*

solr/trucollection/select?q=(*%C2%A2+%2C+%C2%A3+%2C+%C2%A4+%2C+%C2%A5+%2C+%C2%A6+%2C+%C2%A7*
OR "%C2%A2+%2C+%C2%A3+%2C+%C2%A4+%2C+%C2%A5+%2C+%C2%A6+%2C+%C2%A7") AND
(isrestricted:false OR groupid="07C90F45-AA61-4B2A-A446-BEA13C8409E2" OR
"23414BF3-046D-475A-9268-AA7BF5238DB3" OR
"50FC6521-DBCC-4097-A3BA-03AE6C202C2F")=0=50
=owning_facility__v
desc=on=shard=owning_facility__v=status__v=type__v=classification__v
bulk_erp_code product_categorys owning_department__v docobjid erp_code
category_of_change shard groupid title__v author brands hassupportingdocs
last_periodic_review_date__vs technology__c document_language__c
effective_date__v reason_code__c owning_segments__c isrestricted region__c
document_number__v fileuniquename approved_date__vs impacted_departments__v
severity id owning_facility__v change_coordinator lifecycle_categories__c
status__v next_periodic_review_date__vs subtype__v version__v fg_erp_code
impacted_facilities__v type__v franchises country__v policy
legacy_document_number__c formula_number marketing_companys
impacted_segments__c=json


Can anyone help me out.

Best regards,
Shekhar







--
View this message in context: 
http://lucene.472066.n3.nabble.com/Searching-Special-charterer-in-solr-behaving-inconsistent-tp4296493.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Tutorial not working for me

2016-09-16 Thread Pritchett, James
I looked at the managed-schema and it does appear that fields like "name"
were not indexed (if I'm reading this correctly):
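(The field definition itself did not survive the mail formatting.) Purely for
illustration, and not necessarily what my install generated, the possibilities
Erick described would look roughly like this in managed-schema:

    <field name="name" type="text_general" indexed="false" stored="true"/>  <!-- stored only: returned, never searchable -->
    <field name="name" type="string"       indexed="true"  stored="true"/>  <!-- untokenized: exact, case-sensitive match only -->
    <field name="name" type="text_general" indexed="true"  stored="true"/>  <!-- tokenized: name:foundation would match -->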



Not sure if this is because some step was missed, the post command was done
incorrectly, or what. Tutorial says nothing about schemas or indexes. For
reference, here's the entire tutorial run, copied and pasted from my
terminal window with just a few ellipses to collapse the 4000+ document
loads and query results.

marplon:solr-6.2.0 jpritchett$ ./bin/solr start -e cloud -noprompt

Welcome to the SolrCloud example!

Starting up 2 Solr nodes for your example SolrCloud cluster.

Creating Solr home directory
/Users/jpritchett/solr-6.2.0/example/cloud/node1/solr
Cloning /Users/jpritchett/solr-6.2.0/example/cloud/node1 into
   /Users/jpritchett/solr-6.2.0/example/cloud/node2

Starting up Solr on port 8983 using command:
bin/solr start -cloud -p 8983 -s "example/cloud/node1/solr"

Waiting up to 30 seconds to see Solr running on port 8983 [-]
Started Solr server on port 8983 (pid=8216). Happy searching!


Starting up Solr on port 7574 using command:
bin/solr start -cloud -p 7574 -s "example/cloud/node2/solr" -z
localhost:9983

Waiting up to 30 seconds to see Solr running on port 7574 [/]
Started Solr server on port 7574 (pid=8408). Happy searching!


Connecting to ZooKeeper at localhost:9983 ...
Uploading
/Users/jpritchett/solr-6.2.0/server/solr/configsets/data_driven_schema_configs/conf
for config gettingstarted to ZooKeeper at localhost:9983

Creating new collection 'gettingstarted' using command:
http://localhost:8983/solr/admin/collections?action=CREATE=gettingstarted=2=2=2=gettingstarted

{
  "responseHeader":{
"status":0,
"QTime":16272},
  "success":{
"172.16.3.78:8983_solr":{
  "responseHeader":{
"status":0,
"QTime":7526},
  "core":"gettingstarted_shard1_replica2"},
"172.16.3.78:7574_solr":{
  "responseHeader":{
"status":0,
"QTime":7838},
  "core":"gettingstarted_shard2_replica1"}}}

Enabling auto soft-commits with maxTime 3 secs using the Config API

POSTing request to Config API:
http://localhost:8983/solr/gettingstarted/config
{"set-property":{"updateHandler.autoSoftCommit.maxTime":"3000"}}
Successfully set-property updateHandler.autoSoftCommit.maxTime to 3000


SolrCloud example running, please visit: http://localhost:8983/solr

marplon:solr-6.2.0 jpritchett$ bin/post -c gettingstarted docs/
java -classpath /Users/jpritchett/solr-6.2.0/dist/solr-core-6.2.0.jar
-Dauto=yes -Dc=gettingstarted -Ddata=files -Drecursive=yes
org.apache.solr.util.SimplePostTool docs/
SimplePostTool version 5.0.0
Posting files to [base] url
http://localhost:8983/solr/gettingstarted/update...
Entering auto mode. File endings considered are
xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
Entering recursive mode, max depth=999, delay=0s
Indexing directory docs (3 files, depth=0)
POSTing file index.html (text/html) to [base]/extract
POSTing file quickstart.html (text/html) to [base]/extract

[etc.]

POSTing file SolrVelocityResourceLoader.html (text/html) to [base]/extract
POSTing file VelocityResponseWriter.html (text/html) to [base]/extract
4329 files indexed.
COMMITting Solr index changes to
http://localhost:8983/solr/gettingstarted/update...
Time spent: 0:06:03.224
marplon:solr-6.2.0 jpritchett$ bin/post -c gettingstarted
example/exampledocs/*.xml
java -classpath /Users/jpritchett/solr-6.2.0/dist/solr-core-6.2.0.jar
-Dauto=yes -Dc=gettingstarted -Ddata=files
org.apache.solr.util.SimplePostTool example/exampledocs/gb18030-example.xml
example/exampledocs/hd.xml example/exampledocs/ipod_other.xml
example/exampledocs/ipod_video.xml example/exampledocs/manufacturers.xml
example/exampledocs/mem.xml example/exampledocs/money.xml
example/exampledocs/monitor.xml example/exampledocs/monitor2.xml
example/exampledocs/mp500.xml example/exampledocs/sd500.xml
example/exampledocs/solr.xml example/exampledocs/utf8-example.xml
example/exampledocs/vidcard.xml
SimplePostTool version 5.0.0
Posting files to [base] url
http://localhost:8983/solr/gettingstarted/update...
Entering auto mode. File endings considered are
xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
POSTing file gb18030-example.xml (application/xml) to [base]
POSTing file hd.xml (application/xml) to [base]

[etc.]

14 files indexed.
COMMITting Solr index changes to
http://localhost:8983/solr/gettingstarted/update...
Time spent: 0:00:07.578
marplon:solr-6.2.0 jpritchett$ bin/post -c gettingstarted
example/exampledocs/books.json
java -classpath /Users/jpritchett/solr-6.2.0/dist/solr-core-6.2.0.jar
-Dauto=yes -Dc=gettingstarted -Ddata=files
org.apache.solr.util.SimplePostTool example/exampledocs/books.json
SimplePostTool version 5.0.0
Posting files to [base] url
http://localhost:8983/solr/gettingstarted/update...
Entering auto mode. File endings considered are

Re: slow updates/searches

2016-09-16 Thread Rallavagu

Erick,

Was monitoring GC activity and couldn't align GC pauses to this 
behavior. Also, the vmstat shows no swapping or cpu I/O wait. However, 
whenever I see high update response times (and correspondingly high QTimes for 
searches) vmstat shows a series of spikes in the number of "waiting to runnable" 
processes in the "r" column of the "procs" section.


https://dl.dropboxusercontent.com/u/39813705/Screen%20Shot%202016-09-16%20at%209.05.51%20AM.png
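For reference, the capture below comes from a periodic vmstat run; an invocation that produces these columns (assuming -a for inact/active memory and -t for timestamps) would be roughly:

  vmstat -a -t 1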

procs -----------memory---------- ---swap-- -----io---- --system-- ------cpu----- -----timestamp-----
 r  b   swpd     free    inact   active   si   so    bi    bo    in    cs us sy id wa st        CDT
 2  0  71068 18688496  2526604 24204440    0    0     0     0  1433   462 27  1 73  0  0 2016-09-16 11:02:32
 1  0  71068 18688180  2526600 24204568    0    0     0     0  1388   404 26  1 74  0  0 2016-09-16 11:02:33
 1  0  71068 18687928  2526600 24204568    0    0     0     0  1354   401 25  0 75  0  0 2016-09-16 11:02:34
 1  0  71068 18687800  2526600 24204572    0    0     0     0  1311   397 25  0 74  0  0 2016-09-16 11:02:35
 1  0  71068 18687164  2527116 24204844    0    0     0     0  1770   702 31  1 69  0  0 2016-09-16 11:02:36
 1  0  71068 18686944  2527108 24204908    0    0     0    52  1266   421 26  0 74  0  0 2016-09-16 11:02:37
12  1  71068 18682676  2528560 24207116    0    0     0   280  2388   934 34  1 65  0  0 2016-09-16 11:02:38
 2  1  71068 18651340  2530820 24233368    0    0     0  1052 10258  5696 82  5 13  0  0 2016-09-16 11:02:39
 5  0  71068 18648600  2530112 24235060    0    0     0  1988  7261  3644 84  2 13  1  0 2016-09-16 11:02:40
 9  1  71068 18647804  2530580 24236076    0    0     0  1688  7031  3575 84  2 13  1  0 2016-09-16 11:02:41
 1  0  71068 18647628  2530364 24236256    0    0     0   680  7065  4463 61  3 35  1  0 2016-09-16 11:02:42
 1  0  71068 18646344  2531204 24236536    0    0     0    44  6422  4922 35  3 63  0  0 2016-09-16 11:02:43
 2  0  71068 18644460  2532196 24237440    0    0     0     0  6561  5056 25  3 72  0  0 2016-09-16 11:02:44
 0  0  71068 18661900  2531724 24218764    0    0     0     0  7312 10050 11  3 86  0  0 2016-09-16 11:02:45
 2  0  71068 18649400  2532228 24229800    0    0     0     0  7211  6222 34  3 63  0  0 2016-09-16 11:02:46
 0  0  71068 18648280  2533440 24230300    0    0     0   108  3936  3381 20  1 79  0  0 2016-09-16 11:02:47
 0  0  71068 18648156  2533212 24230684    0    0     0    12  1279  1681  2  0 97  0  0 2016-09-16 11:02:48



Captured stack trace including timing for one of the update threads.


org.eclipse.jetty.server.handler.ContextHandler:doHandle (method time = 15 ms, total time = 30782 ms)
 Filter - SolrDispatchFilter:doFilter:181 (method time = 0 ms, total time = 30767 ms)
  Filter - SolrDispatchFilter:doFilter:223 (method time = 0 ms, total time = 30767 ms)
   org.apache.solr.servlet.HttpSolrCall:call:457 (method time = 0 ms, total time = 30767 ms)
    org.apache.solr.servlet.HttpSolrCall:execute:658 (method time = 0 ms, total time = 30767 ms)
     org.apache.solr.core.SolrCore:execute:2073 (method time = 0 ms, total time = 30767 ms)
      org.apache.solr.handler.RequestHandlerBase:handleRequest:156 (method time = 0 ms, total time = 30767 ms)
       org.apache.solr.handler.ContentStreamHandlerBase:handleRequestBody:70 (method time = 0 ms, total time = 30767 ms)
        org.apache.solr.handler.UpdateRequestHandler$1:load:95 (method time = 0 ms, total time = 23737 ms)
         org.apache.solr.handler.loader.XMLLoader:load:178 (method time = 0 ms, total time = 23737 ms)
          org.apache.solr.handler.loader.XMLLoader:processUpdate:251 (method time = 0 ms, total time = 23737 ms)
           org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor:processAdd:104 (method time = 0 ms, total time = 23737 ms)
            org.apache.solr.update.processor.DistributedUpdateProcessor:processAdd:702 (method time = 0 ms, total time = 23737 ms)
             org.apache.solr.update.processor.DistributedUpdateProcessor:versionAdd:1011 (method time = 0 ms, total time = 23477 ms)
              org.apache.solr.update.processor.DistributedUpdateProcessor:getUpdatedDocument:1114 (method time = 0 ms, total time = 51 ms)
               org.apache.solr.update.processor.AtomicUpdateDocumentMerger:merge:110 (method time = 51 ms, total time = 51 ms)
              org.apache.solr.update.processor.DistributedUpdateProcessor:doLocalAdd:924 (method time = 0 ms, total time = 23426 ms)
               org.apache.solr.update.processor.UpdateRequestProcessor:processAdd:49 (method time = 0 ms, total time = 23426 ms)
                org.apache.solr.update.processor.RunUpdateProcessor:processAdd:69 (method time = 0

Re: Best way to generate multivalue fields from streaming API

2016-09-16 Thread Joel Bernstein
Unfortunately there currently isn't a way to split a field. But this would
be nice functionality to add.

The approach would be to add a split operation that would be used by the
select() function. It would look like this:

select(jdbc(...), split(fieldA, delim=","), ...)

This would make a good jira issue.
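If such a split operation existed, the daemon-based ingest from the article might look roughly like this (a sketch only: split() is the hypothetical operation discussed above, and the id, interval, fields and connection details are made up):

  daemon(id="users-sync",
         runInterval="60000",
         update(users,
                batchSize=10,
                select(jdbc(connection="jdbc:mysql://localhost/users?user=root&password=solr",
                            sql="SELECT id, name, tags FROM users",
                            sort="id asc",
                            driver="com.mysql.jdbc.Driver"),
                       id,
                       name,
                       split(tags, delim=","))))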






Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, Sep 16, 2016 at 11:03 AM, Mike Thomsen 
wrote:

> Read this article and thought it could be interesting as a way to do
> ingestion:
>
> https://dzone.com/articles/solr-streaming-expressions-
> for-collection-auto-upd-1
>
> Example from the article:
>
> daemon(id="12345",
>
>  runInterval="6",
>
>  update(users,
>
>  batchSize=10,
>
>  jdbc(connection="jdbc:mysql://localhost/users?user=root&password=solr",
> sql="SELECT id, name FROM users", sort="id asc",
> driver="com.mysql.jdbc.Driver")
>
> ))
>
> What's the best way to handle a multivalue field using this API? Is
> there a way to tokenize something returned in a database field?
>
> Thanks,
>
> Mike
>


Re: Tutorial not working for me

2016-09-16 Thread John Bickerstaff
what happens if you issue this?  Do you see the field in question in the
results?

http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=*:*

On Fri, Sep 16, 2016 at 9:43 AM, Alexandre Rafalovitch 
wrote:

> If your fields are of type string, you have to match them exactly.
>
> But the general queries are probably going against _text_ or similar, which
> copyFields the content of all other fields without storing it, but tokenizes
> it with its own text rules.
>
> Check df parameter in solrconfig.xml or params.json.
>
> Regards,
>Alex
>
> On 16 Sep 2016 10:06 PM, "Pritchett, James" 
> wrote:
>
> I apologize if this is a really stupid question. I followed all
> instructions on installing Tutorial, got data loaded, everything works
> great until I try to query with a field name -- e.g., name:foundation. I
> get zero results from this or any other query which specifies a field name.
> Simple queries return results, and the field names are listed in those
> results correctly. But if I query using names that I know are there and
> values that I know are there, I get nothing.
>
> I figure this must be something basic that is not right about the way
> things have gotten set up, but I am completely blocked at this point. I
> tried blowing it all away and restarting from scratch with no luck. Where
> should I be looking for problems here? I am running this on a MacBook, OS X
> 10.9, latest JDK (1.8).
>
> James
>
> --
>
>
> *James Pritchett*
>
> Leader, Process Redesign and Analysis
>
> __
>
>
> *Learning Ally™*Together It’s Possible
> 20 Roszel Road | Princeton, NJ 08540 | Office: 609.243.7608
>
> jpritch...@learningally.org
>
> www.LearningAlly.org 
>
> Join us in building a community that helps blind, visually impaired &
> dyslexic students thrive.
>
> Connect with our community: *Facebook*
>  | *Twitter*
>  | *LinkedIn*
>  |
> *Explore1in5*  | *Instagram*
>  | *Sign up for our community
> newsletter*  touch/>
>
> Support us: *Donate*
>  | *Volunteer*
>  volunteers/how-you-can-help/>
>


Re: Tutorial not working for me

2016-09-16 Thread Alexandre Rafalovitch
If your fields are of type string, you have to match them exactly.

But the general queries are probably going against _text_ or similar, which
copyFields the content of all other fields without storing it, but tokenizes
it with its own text rules.

Check df parameter in solrconfig.xml or params.json.
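In the data_driven_schema_configs set the tutorial uses, the relevant pieces look roughly like this (a sketch from memory, not the exact files):

  <field name="_text_" type="text_general" indexed="true" stored="false" multiValued="true"/>
  <copyField source="*" dest="_text_"/>            <!-- managed-schema -->
  <str name="df">_text_</str>                      <!-- initParams / params.json -->

So an unqualified q=foundation is searched against _text_, while name:foundation only matches if the "name" field itself was indexed as text.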

Regards,
   Alex

On 16 Sep 2016 10:06 PM, "Pritchett, James" 
wrote:

I apologize if this is a really stupid question. I followed all
instructions on installing Tutorial, got data loaded, everything works
great until I try to query with a field name -- e.g., name:foundation. I
get zero results from this or any other query which specifies a field name.
Simple queries return results, and the field names are listed in those
results correctly. But if I query using names that I know are there and
values that I know are there, I get nothing.

I figure this must be something basic that is not right about the way
things have gotten set up, but I am completely blocked at this point. I
tried blowing it all away and restarting from scratch with no luck. Where
should I be looking for problems here? I am running this on a MacBook, OS X
10.9, latest JDK (1.8).

James

--


*James Pritchett*

Leader, Process Redesign and Analysis

__


*Learning Ally™*Together It’s Possible
20 Roszel Road | Princeton, NJ 08540 | Office: 609.243.7608

jpritch...@learningally.org

www.LearningAlly.org 

Join us in building a community that helps blind, visually impaired &
dyslexic students thrive.

Connect with our community: *Facebook*
 | *Twitter*
 | *LinkedIn*
 |
*Explore1in5*  | *Instagram*
 | *Sign up for our community
newsletter* 

Support us: *Donate*
 | *Volunteer*



Re: slow updates/searches

2016-09-16 Thread Erick Erickson
The first thing I'd look at is whether you're _also_ seeing stop-the-world GC pauses.
If so, there are a number of JVM options that can be tuned.
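If GC logging isn't already on, bin/solr picks these up from solr.in.sh; the flag values below are just the stock Java 8 ones, shown as a sketch rather than a recommendation:

  GC_LOG_OPTS="-verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails \
    -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps \
    -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime"
  # GC_TUNE in the same file holds the collector flags if you want to experiment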

Best,
Erick

On Fri, Sep 16, 2016 at 8:40 AM, Rallavagu  wrote:
> Solr 5.4.1 with embedded jetty single shard - NRT
>
> Looking in the logs, I noticed that there are high QTimes for queries and, around
> the same time, high response times for updates. These are not during "commit" or
> "softCommit" but when the client application is sending updates. Wondering how
> updates could impact query performance. What are the options for tuning?
> Thanks.


Re: Tutorial not working for me

2016-09-16 Thread Erick Erickson
My bet:
the fields (look in managed_schema or, possibly schema.xml)
have stored="true" and indexed="false" set for the fields
in question.

Pretty much everyone takes a few passes before this really
makes sense. "stored" means you see the results returned,
"indexed" must be true before you can search on something.

Second possibility: You've somehow indexed fields as
"string" type rather than one of the text based fieldTypes.
"string" types are not tokenized, thus a field with
"My dog has fleas" will fail to find "My". It'll even not match
"my dog has fleas" (note capital "M").

The admin UI>>select core>>analysis page will show you
lots of this kind of detail, although I admit it takes a bit to
understand all the info (do un-check the "verbose" button
for the nonce).

Now, all that aside, please show us the field definition for
one of the fields in question and, as John mentions, the exact
query (I'd also add debug=true to the query).

Saying you followed the exact instructions somewhere isn't
really helpful. It's likely that there's something innocent-seeming
that was done differently. Giving the information asked for
will help us diagnose what's happening and, perhaps,
improve the docs if we can understand the mis-match.

Best,
Erick

On Fri, Sep 16, 2016 at 8:28 AM, Pritchett, James
 wrote:
> I am following the exact instructions in the tutorial: copy and pasting all
> commands & queries from the tutorial:
> https://lucene.apache.org/solr/quickstart.html. Where it breaks down is
> this one:
>
> http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=name:foundation
>
> This returns no results. Tried in the web admin view as well, also tried
> various field:value combinations to no avail. Clearly something didn't get
> configured correctly, but I saw no error messages when running all the data
> loads, etc. given in the tutorial.
>
> Sorry to be so clueless, but I don't really have anything to go on for
> troubleshooting besides asking dumb questions.
>
> James
>
> On Fri, Sep 16, 2016 at 11:24 AM, John Bickerstaff > wrote:
>
>> Please share the exact query syntax?
>>
>> Are you using a collection you built or one of the examples?
>>
>> On Fri, Sep 16, 2016 at 9:06 AM, Pritchett, James <
>> jpritch...@learningally.org> wrote:
>>
>> > I apologize if this is a really stupid question. I followed all
>> > instructions on installing Tutorial, got data loaded, everything works
>> > great until I try to query with a field name -- e.g., name:foundation. I
>> > get zero results from this or any other query which specifies a field
>> name.
>> > Simple queries return results, and the field names are listed in those
>> > results correctly. But if I query using names that I know are there and
>> > values that I know are there, I get nothing.
>> >
>> > I figure this must be something basic that is not right about the way
>> > things have gotten set up, but I am completely blocked at this point. I
>> > tried blowing it all away and restarting from scratch with no luck. Where
>> > should I be looking for problems here? I am running this on a MacBook,
>> OS X
>> > 10.9, latest JDK (1.8).
>> >
>> > James
>> >
>> > --
>> >
>> >
>> > *James Pritchett*
>> >
>> > Leader, Process Redesign and Analysis
>> >
>> > __
>> >
>> >
>> > *Learning Ally™*Together It’s Possible
>> > 20 Roszel Road | Princeton, NJ 08540 | Office: 609.243.7608
>> >
>> > jpritch...@learningally.org
>> >
>> > www.LearningAlly.org 
>> >
>> > Join us in building a community that helps blind, visually impaired &
>> > dyslexic students thrive.
>> >
>> > Connect with our community: *Facebook*
>> >  | *Twitter*
>> >  | *LinkedIn*
>> >  |
>> > *Explore1in5*  | *Instagram*
>> >  | *Sign up for our community
>> > newsletter* > > touch/>
>> >
>> > Support us: *Donate*
>> >  | *Volunteer*
>> > > > volunteers/how-you-can-help/>
>> >
>>
>
>
>
> --
>
>
> *James Pritchett*
>
> Leader, Process Redesign and Analysis
>
> __
>
>
> *Learning Ally™*Together It’s Possible
> 20 Roszel Road | Princeton, NJ 08540 | Office: 609.243.7608
>
> jpritch...@learningally.org
>
> www.LearningAlly.org 
>
> Join us in building a community that helps blind, visually impaired &
> dyslexic students thrive.
>
> Connect with our community: *Facebook*
>  | *Twitter*
>  | *LinkedIn*
>  |
> *Explore1in5* 

slow updates/searches

2016-09-16 Thread Rallavagu

Solr 5.4.1 with embedded jetty single shard - NRT

Looking in the logs, I noticed that there are high QTimes for queries and, 
around the same time, high response times for updates. These are not during 
"commit" or "softCommit" but when the client application is sending updates. 
Wondering how updates could impact query performance. What are the 
options for tuning? Thanks.


Re: Tutorial not working for me

2016-09-16 Thread Pritchett, James
I am following the exact instructions in the tutorial: copy and pasting all
commands & queries from the tutorial:
https://lucene.apache.org/solr/quickstart.html. Where it breaks down is
this one:

http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=name:foundation

This returns no results. Tried in the web admin view as well, also tried
various field:value combinations to no avail. Clearly something didn't get
configured correctly, but I saw no error messages when running all the data
loads, etc. given in the tutorial.

Sorry to be so clueless, but I don't really have anything to go on for
troubleshooting besides asking dumb questions.

James

On Fri, Sep 16, 2016 at 11:24 AM, John Bickerstaff  wrote:

> Please share the exact query syntax?
>
> Are you using a collection you built or one of the examples?
>
> On Fri, Sep 16, 2016 at 9:06 AM, Pritchett, James <
> jpritch...@learningally.org> wrote:
>
> > I apologize if this is a really stupid question. I followed all
> > instructions on installing Tutorial, got data loaded, everything works
> > great until I try to query with a field name -- e.g., name:foundation. I
> > get zero results from this or any other query which specifies a field
> name.
> > Simple queries return results, and the field names are listed in those
> > results correctly. But if I query using names that I know are there and
> > values that I know are there, I get nothing.
> >
> > I figure this must be something basic that is not right about the way
> > things have gotten set up, but I am completely blocked at this point. I
> > tried blowing it all away and restarting from scratch with no luck. Where
> > should I be looking for problems here? I am running this on a MacBook,
> OS X
> > 10.9, latest JDK (1.8).
> >
> > James
> >
> > --
> >
> >
> > *James Pritchett*
> >
> > Leader, Process Redesign and Analysis
> >
> > __
> >
> >
> > *Learning Ally™*Together It’s Possible
> > 20 Roszel Road | Princeton, NJ 08540 | Office: 609.243.7608
> >
> > jpritch...@learningally.org
> >
> > www.LearningAlly.org 
> >
> > Join us in building a community that helps blind, visually impaired &
> > dyslexic students thrive.
> >
> > Connect with our community: *Facebook*
> >  | *Twitter*
> >  | *LinkedIn*
> >  |
> > *Explore1in5*  | *Instagram*
> >  | *Sign up for our community
> > newsletter*  > touch/>
> >
> > Support us: *Donate*
> >  | *Volunteer*
> >  > volunteers/how-you-can-help/>
> >
>



-- 


*James Pritchett*

Leader, Process Redesign and Analysis

__


*Learning Ally™*Together It’s Possible
20 Roszel Road | Princeton, NJ 08540 | Office: 609.243.7608

jpritch...@learningally.org

www.LearningAlly.org 

Join us in building a community that helps blind, visually impaired &
dyslexic students thrive.

Connect with our community: *Facebook*
 | *Twitter*
 | *LinkedIn*
 |
*Explore1in5*  | *Instagram*
 | *Sign up for our community
newsletter* 

Support us: *Donate*
 | *Volunteer*



Re: Tutorial not working for me

2016-09-16 Thread John Bickerstaff
Please share the exact query syntax?

Are you using a collection you built or one of the examples?

On Fri, Sep 16, 2016 at 9:06 AM, Pritchett, James <
jpritch...@learningally.org> wrote:

> I apologize if this is a really stupid question. I followed all
> instructions on installing Tutorial, got data loaded, everything works
> great until I try to query with a field name -- e.g., name:foundation. I
> get zero results from this or any other query which specifies a field name.
> Simple queries return results, and the field names are listed in those
> results correctly. But if I query using names that I know are there and
> values that I know are there, I get nothing.
>
> I figure this must be something basic that is not right about the way
> things have gotten set up, but I am completely blocked at this point. I
> tried blowing it all away and restarting from scratch with no luck. Where
> should I be looking for problems here? I am running this on a MacBook, OS X
> 10.9, latest JDK (1.8).
>
> James
>
> --
>
>
> *James Pritchett*
>
> Leader, Process Redesign and Analysis
>
> __
>
>
> *Learning Ally™*Together It’s Possible
> 20 Roszel Road | Princeton, NJ 08540 | Office: 609.243.7608
>
> jpritch...@learningally.org
>
> www.LearningAlly.org 
>
> Join us in building a community that helps blind, visually impaired &
> dyslexic students thrive.
>
> Connect with our community: *Facebook*
>  | *Twitter*
>  | *LinkedIn*
>  |
> *Explore1in5*  | *Instagram*
>  | *Sign up for our community
> newsletter*  touch/>
>
> Support us: *Donate*
>  | *Volunteer*
>  volunteers/how-you-can-help/>
>


Re: [Rerank Query] Distributed search + pagination

2016-09-16 Thread Alessandro Benedetti
In addition to that, I think the only way to solve this is to rely on the
aggregator node to actually re-rank after having aggregated.
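For reference, the kind of request being discussed uses the stock {!rerank} parser; a rough sketch (collection name, rerank query and weight are made up):

  http://localhost:8983/solr/foo/select?q=*:*&rows=10&start=10
    &rq={!rerank reRankQuery=$rqq reRankDocs=10 reRankWeight=3}
    &rqq=(category:premium)

With start=10, the aggregator has to merge re-scored and originally scored docs, which is where the inconsistency described below comes from.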

Cheers

On Fri, Sep 9, 2016 at 11:48 PM, Alessandro Benedetti  wrote:

> Let me explain further,
> let's assume a simple case when we have 2 shards.
> ReRankDocs =10 , rows=10 .
>
> Correct me if I am wrong Joel,
> What we would like :
> Page 1: top 10 re-scored
> Page 2: remaining 10 re-scored
> From page 3 onwards, the originally scored docs.
> This is what is happening in a single Solr instance if we put reRankDocs to
> 20.
>
> Let's see with sharding:
> To get the first page we get top 10 ( re-scored) from shard1 and top 10
> reranked for shard 2.
> Then the merged top 10 ( re-scored) will be calculated, and that is the
> page 1.
>
> But when we require the page 2, which means we additionally ask now :
> 20 docs to shard1, 10 re-scored and 10 not.
> 20 docs to shard2, 10 re-scored and 10 not.
> At this point we have 40 docs to merge and rank..
> The docs with the original score can go at any position ( not necessarily
> the last 20)
> In the page 2 we can find potentially docs with the original score.
> This is even more likely if the scores are on different scales (e.g. the
> re-scored docs in one range and the original-scored docs in a much wider one).
>
> Am I right ?
> Did I make any wrong assumption so far ?
>
> Cheers
>
>
> On Fri, Sep 9, 2016 at 7:47 PM, Joel Bernstein  wrote:
>
>> I'm not understanding where the inconsistency comes into play.
>>
>> The re-ranking occurs on the shards. The aggregator node will be sent some
>> docs that have been re-scored and others that are not. But the sorting
>> should be the same as someone pages through the result set.
>>
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>> On Fri, Sep 9, 2016 at 9:28 AM, Alessandro Benedetti <
>> abenede...@apache.org>
>> wrote:
>>
>> > Hi guys,
>> > was just experimenting with a reranker with a really low number of rerank
>> docs
>> > ( 10= pageSize) .
>> > Let's focus on the distributed environment and the manual sharding
>> > approach.
>> >
>> > Currently what happens is that the reranking task is delivered by the
>> > shards, they rescore the docs and then send them back to the aggregator
>> > node.
>> >
>> > If you want to rerank only a few docs ( leaving the others with the
>> original
>> > score following), this can be done in a single Solr instance ( the
>> howmany
>> > logic manages that in the reranker) .
>> >
>> > What happens when you move to a distributed environment ?
>> > The aggregator will aggregate both rescored and original scored
>> documents,
>> > making the final ranking inconsistent.
>> > On the other hand, if we make the reRankDocs threshold dynamic ( to
>> adapt
>> > to start+rows) we can incur the very annoying issue of having a
>> document
>> > sliding through the pages ( visible in the first page , then appearing
>> > again in the third etc etc).
>> >
>> > Any thought ?
>> >
>> > Cheers
>> >
>> > --
>> > --
>> >
>> > Benedetti Alessandro
>> > Visiting card : http://about.me/alessandro_benedetti
>> >
>> > "Tyger, tyger burning bright
>> > In the forests of the night,
>> > What immortal hand or eye
>> > Could frame thy fearful symmetry?"
>> >
>> > William Blake - Songs of Experience -1794 England
>> >
>>
>
>
>
> --
> --
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
>



-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England


Best way to generate multivalue fields from streaming API

2016-09-16 Thread Mike Thomsen
Read this article and thought it could be interesting as a way to do
ingestion:

https://dzone.com/articles/solr-streaming-expressions-for-collection-auto-upd-1

Example from the article:

daemon(id="12345",

 runInterval="6",

 update(users,

 batchSize=10,

 jdbc(connection="jdbc:mysql://localhost/users?user=root&password=solr",
sql="SELECT id, name FROM users", sort="id asc",
driver="com.mysql.jdbc.Driver")

))

What's the best way to handle a multivalue field using this API? Is
there a way to tokenize something returned in a database field?

Thanks,

Mike


Tutorial not working for me

2016-09-16 Thread Pritchett, James
I apologize if this is a really stupid question. I followed all
instructions on installing Tutorial, got data loaded, everything works
great until I try to query with a field name -- e.g., name:foundation. I
get zero results from this or any other query which specifies a field name.
Simple queries return results, and the field names are listed in those
results correctly. But if I query using names that I know are there and
values that I know are there, I get nothing.

I figure this must be something basic that is not right about the way
things have gotten set up, but I am completely blocked at this point. I
tried blowing it all away and restarting from scratch with no luck. Where
should I be looking for problems here? I am running this on a MacBook, OS X
10.9, latest JDK (1.8).

James

-- 


*James Pritchett*

Leader, Process Redesign and Analysis

__


*Learning Ally™*Together It’s Possible
20 Roszel Road | Princeton, NJ 08540 | Office: 609.243.7608

jpritch...@learningally.org

www.LearningAlly.org 

Join us in building a community that helps blind, visually impaired &
dyslexic students thrive.

Connect with our community: *Facebook*
 | *Twitter*
 | *LinkedIn*
 |
*Explore1in5*  | *Instagram*
 | *Sign up for our community
newsletter* 

Support us: *Donate*
 | *Volunteer*



Re: Can I get a document from its Lucene ID?

2016-09-16 Thread Yonik Seeley
On Fri, Sep 16, 2016 at 9:23 AM, Alexandre Rafalovitch
 wrote:
> Because I get this error message and not sure what the next step is:
> "child query must only match non-parent docs, but parent docID=38200
> matched childScorer=class
> org.apache.lucene.search.DisjunctionSumScorer"
>
> I understand that 38200 is transient and all that, but can I get a
> document by it right now? Via a Solr query (and not - say - Luke).
>
> I know I could display it with [docid] transformer and I could sort by
> _docid_ secret field, but I can't see a way to search or limit by that
>  id.

I can't think of a way either... it might be useful to make _docid_
more of a legit pseudo-field (returnable, usable in a function query,
etc)

-Yonik


Re: Miserable Experience Using Solr. Again.

2016-09-16 Thread Alexandre Rafalovitch
On 16 September 2016 at 18:30, Stefan Matheis  wrote:
>> … choice between better docs and better UI, I’ll choose a better UI every 
>> time
>
> Aaron, you (as well as all others) are more than welcome to help out - no 
> matter what you do / how you do it.
>
> While we’d obviously love to get some more hands helping out with the 
> coding parts - improving the UI in terms of wording (as you just pointed out) 
> does help equally as much, if not even more.

And just to back it up with facts:

Let's run this query against my ugly-but-interesting Solr-to-Github
collection ( https://github.com/arafalov/git-to-solr):

http://localhost:8983/solr/git/select?facet.field=committer&facet.mincount=10&facet=on&fq=-message:Merge&fq=commitTime:[NOW/YEAR-5YEAR%20TO%20*]&indent=on&q={!parent%20which=type:commit}fileExt:(css%20js)&rows=10&wt=json

That should be (approximately): "Give me a breakdown of committer names
that committed, more than 10 times, something that includes a css or js
file. Limit this to the last 5 years". And you'd get:


 "facet_counts":{
"facet_queries":{},
"facet_fields":{
  "committer":[
"Stefan Matheis",97,
"Upayavira",38,
"Ryan McKinley",33,
"Erik Hatcher",13,
"Jan Høydahl",12,
"Erick Erickson",11,
"Steven Rowe",10]},
...

That's it... In last 5 years! UI design is hard. Especially when the
core developer team are the hard core backend guys. Solr only got the
Reference Guide (as opposed to WIKI) because LucidWorks did the bulk
of initial (and possibly ongoing) work. The pretty website was also
sponsored. We don't have in-the-team UI/HTML/CSS/JS expertise.

However, if you don't mind installing a third party project, Cloudera
Hue is a free (open-source if I remember correctly) interface to a
bunch of big-data components, including Solr. They do nice videos too:
http://gethue.com/search-dashboards/

Or there is a commercial one from LucidWorks themselves.

A lot more people know how to contribute to the documentation, and the
workflows around that are better. And hopefully they will be better yet
again once the migration away from Confluence happens.

Or perhaps somebody will sponsor a full-blown front-end developer to
help out. Of course, _they_ not being a part of the current team would
need to figure out what is possible and what APIs to use. So
documentation would come useful there too.

Regards,
   Alex.


Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


Re: Miserable Experience Using Solr. Again.

2016-09-16 Thread Shawn Heisey
Responses inline.  More potentially flimsy excuses coming your way.

On 9/15/2016 9:56 PM, Aaron Greenspan wrote:
> My two cents: I’m glad to see the discussion over improved documentation, but 
> if you give me a choice between better docs and better UI, I’ll choose a 
> better UI every time. If contributors are going to spend real time on the 
> concerns raised in this thread, spend the time on making the software better 
> to the point where more docs are unnecessary. All sorts of things could 
> improve that would make the product far more intuitive (and I know, there are 
> probably JIRA entries on most of these already…).

The UI is not really intended to be the way that Solr gets accessed. 
It's mostly just a way for an admin to peer into what Solr is doing.  At
this time, it's not really a good tool for configuration.  The admin UI
in 1.x and 3.x only had one or two ways to change the system ... and one
of those ways was enable/disable on the ping handler.  Everything else
the UI did back then was informational.

In newer versions of Solr, the UI does have some capability to make
changes to the system state -- which it does by accessing the HTTP API,
same as any program you would write that uses Solr.

> - The pseudo-frames in the web UI are the source of all kinds of problems, 
> with lots of weird horizontal scrolling I’ve noticed over the years. It makes 
> the Logging screen in particular infuriating to use. When I click on certain 
> log entries an arbitrary-seeming "false" flips to "true" under the "WARN" 
> statement in the Level column. But on other log entries, it all just goes 
> haywire all over the screen because it’s too big both horizontally and 
> vertically, and then re-condenses as though I’d never clicked, as I mentioned 
> before.

If the UI problems you've found are filed as bugs (or as a single bug
listing the problems), we can take it from there.  My free time is
pretty small, or I would do it myself.

Checking the actual logfile has always been a better option than the
Logging tab in the UI.  The info in ERROR messages is typically too
large for effective display in the UI, and the UI excludes messages that
are at a severity of INFO or less.  The default level for the logfile is
INFO, and is typically very verbose.

> - The top menu on the left is in plain English. The core menu on the bottom 
> is written as though it’s being viewed by a person who only speaks UNIX. For 
> example, there is no space between "Data" and "Import" in "DataImport" and 
> "Segments info" could just be "Segments". Is "Plugins / Stats" two menus in 
> one?

The Solr feature that this accesses is contained in a class named
DataImportHandler.  The most commonly configured URL path for the
handler is "/dataimport".  Is it more important to be grammatically
correct, or connect the wording in the admin UI with what the user
actually sees in solrconfig.xml?  You're probably right that it should
have that space, so it doesn't grate on the nerves of those who like
things to be correct.

On "Segments Info" ... brevity counts for a lot in an interface.  That
sounds like a good change.

> - "Ping" in the menu takes you nowhere in particular and shouldn’t really be 
> a menu item. It should be part of the main dashboard with all of the other 
> tech stats (which I do like) or a menu called "Status". (Why would one core 
> ping faster than another anyway? If this is really for "cloud" installations 
> where cores can be split up on different servers, why am I seeing it when 
> everything is local and immediate?)

Good point.  I'm not sure why the time for the ping is so prominently
displayed.  Ping isn't really about speed -- it's about whether or not
the server is up and functional.  It's also a legacy feature that sort
of works with Cloud, but isn't really aware of Cloud.  Plenty of room
for improvement in the UI, in the ping feature itself, and the docs.

> - On the Data Import page, the expandable icons are [-] when they’re expanded 
> and still [-] when they’re collapsed. Extremely confusing.

That's definitely a bug.  The priority of that bug might be debatable.

> - The Data Import UI makes no mention anywhere of the ability to import from 
> MySQL, which is 99% of what I want to do with this product. It doesn’t tell 
> me how to set up the MySQL connector, doesn’t give me a button that turns it 
> on in some modular fashion, doesn’t tell me if the server connection is 
> successful, doesn’t let me easily enter or edit credentials, doesn’t let me 
> edit my queries anywhere, and doesn’t let me test out a new query and see how 
> it might fit into the Solr schema. These deficiencies are presumably also 
> true for any database data source, e.g. Postgres/DB2/ODBC/whatever—which also 
> are not listed, were I curious to know what Solr can do just by looking at 
> the product itself.
>
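For reference, pointing the DataImportHandler at a JDBC source is currently done entirely in configuration files rather than the UI; a minimal sketch (driver, URL, credentials and the query are placeholders):

  <!-- db-data-config.xml -->
  <dataConfig>
    <dataSource driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://localhost/mydb"
                user="solr" password="secret"/>
    <document>
      <entity name="item" query="SELECT id, name FROM items">
        <field column="id" name="id"/>
        <field column="name" name="name"/>
      </entity>
    </document>
  </dataConfig>

  <!-- solrconfig.xml: load the DIH jar and register the handler;
       the MySQL JDBC driver jar has to be dropped into a lib directory by hand -->
  <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-dataimporthandler-.*\.jar"/>
  <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">db-data-config.xml</str>
    </lst>
  </requestHandler>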
> - Nor does the Data Import UI have another section for picking a folder on 
> the filesystem that might contain PDFs I 

Can I get a document from its Lucene ID?

2016-09-16 Thread Alexandre Rafalovitch
Because I get this error message and not sure what the next step is:
"child query must only match non-parent docs, but parent docID=38200
matched childScorer=class
org.apache.lucene.search.DisjunctionSumScorer"

I understand that 38200 is transient and all that, but can I get a
document by it right now? Via a Solr query (and not - say - Luke).

I know I could display it with [docid] transformer and I could sort by
_docid_ secret field, but I can't see a way to search or limit by that
id. Is there something obvious I missed? Or is the error like in that
joke about Sherlock Holmes and the air balloon?
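For reference, the two half-measures mentioned look like this (collection name made up):

  .../solr/foo/select?q=*:*&fl=id,[docid]        (show the internal id per doc)
  .../solr/foo/select?q=*:*&sort=_docid_+asc     (order by internal id)

but neither lets you filter on a specific internal id.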

Regards,
   Alex.

Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


Re: Miserable Experience Using Solr. Again.

2016-09-16 Thread Stefan Matheis
> … choice between better docs and better UI, I’ll choose a better UI every time

Aaron, you (as well as all others) are more than welcome to help out - no 
matter what you do / how you do it.

While we’d obviously love to get some more hands helping out with the 
coding parts - improving the UI in terms of wording (as you just pointed out) 
does help equally as much, if not even more.

When i started this whole new Admin UI thing, my intentions were primarily 
to make it look like it was from recent times and not a century ago. Afterwards 
Upayavira joined in to upgrade the frontend architecture to the current state 
of the art - which so far hasn’t helped as much as we’d expected in getting others to 
contribute. I’m running out of ideas about what else could help. We are here in 
backend country and not that attractive for capable frontend developers.

We both came up with whatever we could - neither of us is a designer, at most a 
random guy with two eyes. In certain situations i’m the same as you, i’m the 
first person to criticize this and that - i often see what others could improve, 
but as often i do not realize that i could do the very same for projects where 
i’m involved. and that’s for a variety of reasons.

To sum it up: if you (again, as well as others) do not speak up - our hands are 
tied. of course it’s easier to report a specific bug that gets fixed, but 
nobody said it’s the only thing you can do. as helpful and needed as it is to have 
people working on the documentation instead of contributing code - suggestions on 
the ui itself are just as important. you don’t have to actually do it, especially 
not if it’s an area where you can’t help .. but you are one of many using it 
day in, day out. 

And that goes for all the things .. wordings, usability as well as, and 
especially, design. the ui still looks like (actually is) my first 
work-in-progress draft from years ago - and the reason for that is certainly 
not because we all love it to death and refuse to change the smallest bits.

those were a bit more than my two cents, but they needed to get off my chest.

-Stefan


On September 16, 2016 at 5:56:34 AM, Aaron Greenspan 
(aaron.greens...@plainsite.org) wrote:
> Hi again,
>  
> My two cents: I’m glad to see the discussion over improved documentation, but 
> if you give  
> me a choice between better docs and better UI, I’ll choose a better UI every 
> time. If contributors  
> are going to spend real time on the concerns raised in this thread, spend the 
> time on making  
> the software better to the point where more docs are unnecessary. All sorts 
> of things  
> could improve that would make the product far more intuitive (and I know, 
> there are probably  
> JIRA entries on most of these already…).
>  
> - The pseudo-frames in the web UI are the source of all kinds of problems, 
> with lots of weird  
> horizontal scrolling I’ve noticed over the years. It makes the Logging screen 
> in particular  
> infuriating to use. When I click on certain log entries an arbitrary-seeming 
> "false"  
> flips to "true" under the "WARN" statement in the Level column. But on other 
> log entries,  
> it all just goes haywire all over the screen because it’s too big both 
> horizontally and  
> vertically, and then re-condenses as though I’d never clicked, as I mentioned 
> before.  
>  
> - The top menu on the left is in plain English. The core menu on the bottom 
> is written as though  
> it’s being viewed by a person who only speaks UNIX. For example, there is no 
> space between  
> "Data" and "Import" in "DataImport" and "Segments info" could just be 
> "Segments". Is  
> "Plugins / Stats" two menus in one?
>  
> - "Ping" in the menu takes you nowhere in particular and shouldn’t really be 
> a menu item.  
> It should be part of the main dashboard with all of the other tech stats 
> (which I do like)  
> or a menu called "Status". (Why would one core ping faster than another 
> anyway? If this  
> is really for "cloud" installations where cores can be split up on different 
> servers,  
> why am I seeing it when everything is local and immediate?)
>  
> - On the Data Import page, the expandable icons are [-] when they’re expanded 
> and still  
> [-] when they’re collapsed. Extremely confusing.
>  
> - The Data Import UI makes no mention anywhere of the ability to import from 
> MySQL, which  
> is 99% of what I want to do with this product. It doesn’t tell me how to set 
> up the MySQL connector,  
> doesn’t give me a button that turns it on in some modular fashion, doesn’t 
> tell me if the  
> server connection is successful, doesn’t let me easily enter or edit 
> credentials, doesn’t  
> let me edit my queries anywhere, and doesn’t let me test out a new query and 
> see how it might  
> fit into the Solr schema. These deficiencies are presumably also true for any 
> database  
> data source, e.g. Postgres/DB2/ODBC/whatever—which also are not listed, were 
> I curious  
> to know what Solr can do just 

Re: help with field definition

2016-09-16 Thread Emir Arnautovic

Hi,

I missed that you had already defined the field and are having trouble 
with the query (I did not read the stackoverflow post). I added an answer there, 
but just in case somebody else is having similar trouble, the issue is how the 
query is written - the space has to be escaped:


  q=Justin\ Bieber
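The same query can also be written URL-encoded or, for an untokenized field like this one, as a quoted phrase; a quick sketch, assuming the singerName field from the question:

  q=singerName:Justin\ Bieber          (as typed in the admin UI)
  q=singerName%3AJustin%5C%20Bieber    (URL-encoded over HTTP)
  q=singerName:"Justin Bieber"         (quoting usually behaves the same here)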

Regards,
Emir

On 13.09.2016 23:27, Gandham, Satya wrote:

HI,

   I need help with defining a field ‘singerName’ with the right 
tokenizers and filters such that it gives me the below described behavior:

I have a few documents as given below:

Doc 1
   singerName: Justin Beiber
Doc 2:
   singerName: Justin Timberlake
…


Below is the list of quries and the corresponding matches:

Query 1: “My fav artist Justin Beiber is very impressive”
Docs Matched : Doc1

Query 2: “I have a Justin Timberlake poster on my wall”
Docs Matched: Doc2

Query 3: “The name Bieber Justin is unique”
Docs Matched: None

Query 4: “Timberlake is a lake of timber..?”
Docs Matched: None.

I have this described a bit more detailed here: 
http://stackoverflow.com/questions/39399321/solr-shingle-query-matching-keyword-tokenized-field

I’d appreciate any help in addressing this problem.

Thanks !!



--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/