Re: /select results different between 5.4 and 6.1

2016-08-19 Thread John Bickerstaff
Many thanks.

On Fri, Aug 19, 2016 at 4:22 PM, Anshum Gupta 
wrote:

> The default similarity changed from TF-IDF to BM25 in 6.0.
>
> On Fri, Aug 19, 2016 at 3:00 PM John Bickerstaff
> wrote:
>
> > Bump!
> >
> > TL;DR Question: Are scores (and debug output) *expected* to be different
> > between 5.4 and 6.1?
> >
> > On Thu, Aug 18, 2016 at 2:44 PM, John Bickerstaff <
> > j...@johnbickerstaff.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > TL;DR -
> > > Is it expected that the /select endpoint would produce different
> > > scores/result order between versions 5.4 and 6.1?
> > >
> > >
> > > (I'm aware that it's certainly possible I've done something different to
> > > these environments, although at this point I can't see any difference in
> > > configs etc... and I used a very simple search against /select to test this)
> > >
> > > == Detail ==
> > >
> > > I'm currently seeing different scoring and different result order when I
> > > compare Solr results in the Admin console for a 5.4 and 6.1 environment.
> > >
> > > I'm using the /select endpoint to try to avoid any difference in
> > > configuration.  To the best of my knowledge (and reading) I haven't ever
> > > modified the xml for that endpoint.
> > >
> > > As I was looking into it, I saw that the debug output looks quite
> > > different in 6.1...
> > >
> > > Any advice, including "You must have broken it yourself, that's
> > > impossible" is much appreciated.
> > >
> > >
> > >
> > > Here's debug from the "old" 5.4 SolrCloud environment.  The id's are a
> > > pain to read, but not only am I getting different scores, I'm getting
> > > different docs (or docs in a clearly different order)
> > >
> > > "debug": { "rawquerystring": "chiari", "querystring": "chiari", "
> > > parsedquery": "text:chiari", "parsedquery_toString": "text:chiari", "
> > > explain": { "d9644f86-5fe2-4a9f-8517-545e2cde0b64": "\n4.3581347 =
> > > weight(text:chiari in 26783) [ClassicSimilarity], result of:\n
> 4.3581347
> > =
> > > fieldWeight in 26783, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> > 1.0
> > > = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 =
> > > fieldNorm(doc=26783)\n", "1347f707-6fdd-4864-b9dd-6d3e7cc32bf5":
> > "\n4.3581347
> > > = weight(text:chiari in 26792) [ClassicSimilarity], result of:\n
> > 4.3581347
> > > = fieldWeight in 26792, product of:\n 1.0 = tf(freq=1.0), with freq
> of:\n
> > > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> > 0.625 =
> > > fieldNorm(doc=26792)\n", "d01c32ad-e29d-4b65-9930-f8a6844a2613":
> > "\n4.3581347
> > > = weight(text:chiari in 27028) [ClassicSimilarity], result of:\n
> > 4.3581347
> > > = fieldWeight in 27028, product of:\n 1.0 = tf(freq=1.0), with freq
> of:\n
> > > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> > 0.625 =
> > > fieldNorm(doc=27028)\n", "0c5a4be7-1162-4b1a-ab83-4b48a690fc3a":
> > "\n4.3581347
> > > = weight(text:chiari in 27029) [ClassicSimilarity], result of:\n
> > 4.3581347
> > > = fieldWeight in 27029, product of:\n 1.0 = tf(freq=1.0), with freq
> of:\n
> > > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> > 0.625 =
> > > fieldNorm(doc=27029)\n", "e1cb441d-9d60-482d-956b-3fbc964a17c1":
> > "\n4.3581347
> > > = weight(text:chiari in 27042) [ClassicSimilarity], result of:\n
> > 4.3581347
> > > = fieldWeight in 27042, product of:\n 1.0 = tf(freq=1.0), with freq
> of:\n
> > > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> > 0.625 =
> > > fieldNorm(doc=27042)\n", "f87951f1-e163-4f17-a628-904b9df0c609":
> > "\n4.3581347
> > > = weight(text:chiari in 27043) [ClassicSimilarity], result of:\n
> > 4.3581347
> > > = fieldWeight in 27043, product of:\n 1.0 = tf(freq=1.0), with freq
> of:\n
> > > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> > 0.625 =
> > > fieldNorm(doc=27043)\n", "caaa7ca1-34cb-44a8-8dd9-12c909db8c2d":
> > "\n4.3581347
> > > = weight(text:chiari in 27044) [ClassicSimilarity], result of:\n
> > 4.3581347
> > > = fieldWeight in 27044, product of:\n 1.0 = tf(freq=1.0), with freq
> of:\n
> > > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> > 0.625 =
> > > fieldNorm(doc=27044)\n", "ada7a87e-725a-4533-b72e-3817af4c7179":
> > "\n4.3581347
> > > = weight(text:chiari in 27055) [ClassicSimilarity], result of:\n
> > 4.3581347
> > > = fieldWeight in 27055, product of:\n 1.0 = tf(freq=1.0), with freq
> of:\n
> > > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> > 0.625 =
> > > fieldNorm(doc=27055)\n", "ac6d47fd-9a59-47d6-8cfb-11b34c7ded54":
> > "\n4.3581347
> > > = weight(text:chiari in 27056) [ClassicSimilarity], result of:\n
> > 4.3581347
> > > = fieldWeight in 27056, product of:\n 1.0 = tf(freq=1.0), with freq
> of:\n
> > > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> > 0.625 =
> > > fieldNorm(doc=27056)\n", 

Re: help with DIH transformer to add a suffix to column names

2016-08-19 Thread Alexandre Rafalovitch
Can you give an example of the SQL column name and the corresponding Solr
field name you want?

Because 'name_*' is not a valid field name.

Also, why specifically are you doing this?

Regards,
Alex

On 20 Aug 2016 6:04 AM, "Wendy"  wrote:

Hi, how can I append a suffix "_*" to all column names from a MySQL database?
I am working on a project that indexes data from MySQL. I would like to use
dynamic fields to index fields without specifying each field/column name. I
have tried a DIH custom transformer to append a suffix to the column names,
but I get no error and no data. Does anyone have a good working example?
Thanks!





Re: /select results different between 5.4 and 6.1

2016-08-19 Thread Anshum Gupta
The default similarity changed from TF-IDF to BM25 in 6.0.
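
If you need to compare like with like during the upgrade, the 5.x scoring can
be pinned explicitly in the schema. A one-line sketch (an option for
comparison, not a recommendation over BM25):

  <!-- schema.xml / managed-schema: restore the pre-6.0 TF-IDF scoring -->
  <similarity class="solr.ClassicSimilarityFactory"/>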

On Fri, Aug 19, 2016 at 3:00 PM John Bickerstaff 
wrote:

> Bump!
>
> TL;DR Question: Are scores (and debug output) *expected* to be different
> between 5.4 and 6.1?
>
> On Thu, Aug 18, 2016 at 2:44 PM, John Bickerstaff <
> j...@johnbickerstaff.com>
> wrote:
>
> > Hi all,
> >
> > TL;DR -
> > Is it expected that the /select endpoint would produce different
> > scores/result order between versions 5.4 and 6.1?
> >
> >
> > (I'm aware that it's certainly possible I've done something different to
> > these environments, although at this point I can't see any difference in
> > configs etc... and I used a very simple search against /select to test
> > this)
> >
> > == Detail ==
> >
> > I'm currently seeing different scoring and different result order when I
> > compare Solr results in the Admin console for a 5.4 and 6.1 environment.
> >
> > I'm using the /select endpoint to try to avoid any difference in
> > configuration.  To the best of my knowledge (and reading) I haven't ever
> > modified the xml for that endpoint.
> >
> > As I was looking into it, I saw that the debug output looks quite
> > different in 6.1...
> >
> > Any advice, including "You must have broken it yourself, that's
> > impossible" is much appreciated.
> >
> >
> >
> > Here's debug from the "old" 5.4 SolrCloud environment.  The id's are a
> > pain to read, but not only am I getting different scores, I'm getting
> > different docs (or docs in a clearly different order)
> >
> > "debug": { "rawquerystring": "chiari", "querystring": "chiari", "
> > parsedquery": "text:chiari", "parsedquery_toString": "text:chiari", "
> > explain": { "d9644f86-5fe2-4a9f-8517-545e2cde0b64": "\n4.3581347 =
> > weight(text:chiari in 26783) [ClassicSimilarity], result of:\n 4.3581347
> =
> > fieldWeight in 26783, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> 1.0
> > = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 =
> > fieldNorm(doc=26783)\n", "1347f707-6fdd-4864-b9dd-6d3e7cc32bf5":
> "\n4.3581347
> > = weight(text:chiari in 26792) [ClassicSimilarity], result of:\n
> 4.3581347
> > = fieldWeight in 26792, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> 0.625 =
> > fieldNorm(doc=26792)\n", "d01c32ad-e29d-4b65-9930-f8a6844a2613":
> "\n4.3581347
> > = weight(text:chiari in 27028) [ClassicSimilarity], result of:\n
> 4.3581347
> > = fieldWeight in 27028, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> 0.625 =
> > fieldNorm(doc=27028)\n", "0c5a4be7-1162-4b1a-ab83-4b48a690fc3a":
> "\n4.3581347
> > = weight(text:chiari in 27029) [ClassicSimilarity], result of:\n
> 4.3581347
> > = fieldWeight in 27029, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> 0.625 =
> > fieldNorm(doc=27029)\n", "e1cb441d-9d60-482d-956b-3fbc964a17c1":
> "\n4.3581347
> > = weight(text:chiari in 27042) [ClassicSimilarity], result of:\n
> 4.3581347
> > = fieldWeight in 27042, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> 0.625 =
> > fieldNorm(doc=27042)\n", "f87951f1-e163-4f17-a628-904b9df0c609":
> "\n4.3581347
> > = weight(text:chiari in 27043) [ClassicSimilarity], result of:\n
> 4.3581347
> > = fieldWeight in 27043, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> 0.625 =
> > fieldNorm(doc=27043)\n", "caaa7ca1-34cb-44a8-8dd9-12c909db8c2d":
> "\n4.3581347
> > = weight(text:chiari in 27044) [ClassicSimilarity], result of:\n
> 4.3581347
> > = fieldWeight in 27044, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> 0.625 =
> > fieldNorm(doc=27044)\n", "ada7a87e-725a-4533-b72e-3817af4c7179":
> "\n4.3581347
> > = weight(text:chiari in 27055) [ClassicSimilarity], result of:\n
> 4.3581347
> > = fieldWeight in 27055, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> 0.625 =
> > fieldNorm(doc=27055)\n", "ac6d47fd-9a59-47d6-8cfb-11b34c7ded54":
> "\n4.3581347
> > = weight(text:chiari in 27056) [ClassicSimilarity], result of:\n
> 4.3581347
> > = fieldWeight in 27056, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> 0.625 =
> > fieldNorm(doc=27056)\n", "4aaa7697-b26a-4bea-ba4e-70d18ea649f0":
> "\n4.3581347
> > = weight(text:chiari in 62240) [ClassicSimilarity], result of:\n
> 4.3581347
> > = fieldWeight in 62240, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> > 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n
> 0.625 =
> > fieldNorm(doc=62240)\n" }, "QParser": "LuceneQParser", 

Re: /select results different between 5.4 and 6.1

2016-08-19 Thread John Bickerstaff
Bump!

TL;DR Question: Are scores (and debug output) *expected* to be different
between 5.4 and 6.1?

On Thu, Aug 18, 2016 at 2:44 PM, John Bickerstaff 
wrote:

> Hi all,
>
> TL;DR -
> Is it expected that the /select endpoint would produce different
> scores/result order between versions 5.4 and 6.1?
>
>
> (I'm aware that it's certainly possible I've done something different to
> these environments, although at this point I can't see any difference in
> configs etc... and I used a very simple search against /select to test this)
>
> == Detail ==
>
> I'm currently seeing different scoring and different result order when I
> compare Solr results in the Admin console for a 5.4 and 6.1 environment.
>
> I'm using the /select endpoint to try to avoid any difference in
> configuration.  To the best of my knowledge (and reading) I haven't ever
> modified the xml for that endpoint.
>
> As I was looking into it, I saw that the debug output looks quite
> different in 6.1...
>
> Any advice, including "You must have broken it yourself, that's
> impossible" is much appreciated.
>
>
>
> Here's debug from the "old" 5.4 SolrCloud environment.  The id's are a
> pain to read, but not only am I getting different scores, I'm getting
> different docs (or docs in a clearly different order)
>
> "debug": { "rawquerystring": "chiari", "querystring": "chiari", "
> parsedquery": "text:chiari", "parsedquery_toString": "text:chiari", "
> explain": { "d9644f86-5fe2-4a9f-8517-545e2cde0b64": "\n4.3581347 =
> weight(text:chiari in 26783) [ClassicSimilarity], result of:\n 4.3581347 =
> fieldWeight in 26783, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0
> = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 =
> fieldNorm(doc=26783)\n", "1347f707-6fdd-4864-b9dd-6d3e7cc32bf5": "\n4.3581347
> = weight(text:chiari in 26792) [ClassicSimilarity], result of:\n 4.3581347
> = fieldWeight in 26792, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 =
> fieldNorm(doc=26792)\n", "d01c32ad-e29d-4b65-9930-f8a6844a2613": "\n4.3581347
> = weight(text:chiari in 27028) [ClassicSimilarity], result of:\n 4.3581347
> = fieldWeight in 27028, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 =
> fieldNorm(doc=27028)\n", "0c5a4be7-1162-4b1a-ab83-4b48a690fc3a": "\n4.3581347
> = weight(text:chiari in 27029) [ClassicSimilarity], result of:\n 4.3581347
> = fieldWeight in 27029, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 =
> fieldNorm(doc=27029)\n", "e1cb441d-9d60-482d-956b-3fbc964a17c1": "\n4.3581347
> = weight(text:chiari in 27042) [ClassicSimilarity], result of:\n 4.3581347
> = fieldWeight in 27042, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 =
> fieldNorm(doc=27042)\n", "f87951f1-e163-4f17-a628-904b9df0c609": "\n4.3581347
> = weight(text:chiari in 27043) [ClassicSimilarity], result of:\n 4.3581347
> = fieldWeight in 27043, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 =
> fieldNorm(doc=27043)\n", "caaa7ca1-34cb-44a8-8dd9-12c909db8c2d": "\n4.3581347
> = weight(text:chiari in 27044) [ClassicSimilarity], result of:\n 4.3581347
> = fieldWeight in 27044, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 =
> fieldNorm(doc=27044)\n", "ada7a87e-725a-4533-b72e-3817af4c7179": "\n4.3581347
> = weight(text:chiari in 27055) [ClassicSimilarity], result of:\n 4.3581347
> = fieldWeight in 27055, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 =
> fieldNorm(doc=27055)\n", "ac6d47fd-9a59-47d6-8cfb-11b34c7ded54": "\n4.3581347
> = weight(text:chiari in 27056) [ClassicSimilarity], result of:\n 4.3581347
> = fieldWeight in 27056, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 =
> fieldNorm(doc=27056)\n", "4aaa7697-b26a-4bea-ba4e-70d18ea649f0": "\n4.3581347
> = weight(text:chiari in 62240) [ClassicSimilarity], result of:\n 4.3581347
> = fieldWeight in 62240, product of:\n 1.0 = tf(freq=1.0), with freq of:\n
> 1.0 = termFreq=1.0\n 6.9730153 = idf(docFreq=281, maxDocs=110738)\n 0.625 =
> fieldNorm(doc=62240)\n" }, "QParser": "LuceneQParser", "timing": { "time":
> 2, "prepare": { "time": 0, "query": { "time": 0 },
>
> ... and here's the same from the Solr Cloud 6.0 environment
>
> "debug":{ "rawquerystring":"chiari", "querystring":"chiari", "parsedquery
> ":"text:chiari", "parsedquery_toString":"text:chiari", "explain":{ "
> 85249c23-ef68-4276-9ef7-48c290033993":"\n9.735645 = weight(text:chiari in
> 106960) [], 

Re: Need to understand solr merging and commit relationship

2016-08-19 Thread Shawn Heisey
On 8/16/2016 11:47 AM, kshitij tyagi wrote:
> I need to understand clearly that is there any relationship between solr
> merging and solr commit?
>
> If there is then what is it?
>
> Also I need to understand how both of these affect indexing speed on the
> core?

Whenever a new segment is written, the merge policy is checked to see
whether a merge is needed.  If it is needed, then the merge is scheduled.

A commit operation can (and frequently does) write a new segment, but
that is not the only thing that can write (flush) new segments.  When
the indexing RAM buffer fills up, a segment will be flushed, even
without a commit.

When paired with the default NRT Directory implementation, soft commits
change the dynamics slightly, but not the way things generally operate. 
Soft commits are capable of flushing the latest segment(s) to memory,
instead of the disk, but only if they are quite small.

I would not expect commits to *directly* affect indexing speed unless
you are doing commits extremely frequently.  Commits might indirectly
affect indexing speed if they trigger a large merge.

Merging can cause issues with indexing speed, even if it's happening in
a different Solr core on the same machine.  This is because the system
resources (I/O bandwidth, memory, CPU) required for a merge are also
required to write a new segment.  Also, because flushing a new segment
is effectively the same operation as the writing part of a merge, if too
many merges are scheduled at once on a core, indexing on that core can
stop entirely until the number of scheduled merges drops.

Merging can also cause issues with query speed, if there is not
sufficient memory available to the OS for effective disk caching.
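
For reference, the knobs involved live in the <indexConfig> section of
solrconfig.xml. A minimal sketch with illustrative values (not
recommendations):

  <indexConfig>
    <!-- a new segment is flushed when this buffer fills, even without a commit -->
    <ramBufferSizeMB>100</ramBufferSizeMB>
    <!-- the merge policy that is checked whenever a new segment is written -->
    <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
      <int name="maxMergeAtOnce">10</int>
      <int name="segmentsPerTier">10</int>
    </mergePolicyFactory>
    <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
  </indexConfig>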

Thanks,
Shawn



help with DIH transformer to add a suffix to column names

2016-08-19 Thread Wendy
Hi, how can I append a suffix "_*" to all column names from a MySQL database?
I am working on a project that indexes data from MySQL. I would like to use
dynamic fields to index fields without specifying each field/column name. I
have tried a DIH custom transformer to append a suffix to the column names,
but I get no error and no data. Does anyone have a good working example?
Thanks!
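
In case it helps while this is sorted out, one way to do this in DIH is a
ScriptTransformer that rewrites the row keys. A minimal sketch for
data-config.xml (the entity name, query, and "_t" suffix are hypothetical):

  <dataConfig>
    <script><![CDATA[
      function addSuffix(row) {
        var keys = row.keySet().toArray();     // snapshot, since we mutate the map below
        for (var i = 0; i < keys.length; i++) {
          var k = keys[i];
          if (String(k) == 'id') continue;     // leave the uniqueKey column alone
          row.put(k + '_t', row.get(k));       // copy the value under the suffixed name
          row.remove(k);                       // drop the original column name
        }
        return row;
      }
    ]]></script>
    <document>
      <entity name="item" transformer="script:addSuffix"
              query="SELECT * FROM item"/>
    </document>
  </dataConfig>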




Re: How to recovery node

2016-08-19 Thread Erick Erickson
You removed the node by deleting directories. Therefore
Zookeeper still has a record for shard2_replica2
and shard1_replica2. When you added a new replica, it
had to choose a new name and those were already
taken, thus the "replica3" parts.

And that's the same explanation for why you have two "down"
nodes. The ZK state (see adminUI>>cloud>>tree>>cloudexample>>state.json)
still has the information for replica2 for both shards. Solr can't
magically know you removed the nodes permanently, maybe you
just shut down the Solr instance and it'll come back sometime

So use the DELETEREPLICA command to clean up the down
replicas and you should be fine.
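
Something like this (the replica value is hypothetical; use the real
"core_node" names from state.json):

  curl "http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=cloudexample&shard=shard1&replica=core_node2"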

Best,
Erick

On Fri, Aug 19, 2016 at 8:36 AM, Jason  wrote:
> Hi, Erick
> I just learned that the ADDREPLICA command is for this.
> When I wrote my question, I had just tried the command
> http://localhost:8984/solr/admin/collections?action=ADDREPLICA&collection=cloudexample&shard=shard1&node=localhost:8984_solr
> but it returned an error.
> Now I know what the problem is:
> the node name parameter needs the actual IP.
> So now node2 with 2 replicas is running well.
> But I have another question.
> The newly created node2 has two shards as replicas,
> named 'cloudexample_shard1_replica3' and
> 'cloudexample_shard2_replica3'.
> Why aren't they 'cloudexample_shard1_replica2' and
> 'cloudexample_shard2_replica2'?
> And the cloud graph looks like this:
>
> cloudexample -- shard1 -- 10.3.4.20:8983 (Leader, Active)
>                           10.3.4.20:8984 (Down)
>                           10.3.4.20:8984 (Active)
>              -- shard2 -- 10.3.4.20:8983 (Leader, Active)
>                           10.3.4.20:8984 (Down)
>                           10.3.4.20:8984 (Active)
>
> As shown in the graph, two replicas turned out down.
> What happened? What should I do?
>
>
>
>
> Erick Erickson wrote
>> Let Solr do it for you. In this case use the collections API
>> ADDREPLICA command. You tell it what collection and
>> shard you want replicated, it'll create the right structure in the
>> right place and synch the index.
>>
>> See:
>> https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api_addreplica
>>
>> If you require exact placement of the replica, specify the "node"
>> parameter.
>>
>> Best,
>> Erick
>>
>> On Fri, Aug 19, 2016 at 2:19 AM, Jason wrote:
>>> I'm new to SolrCloud (ver. 6.1), so I'm on the tutorial.
>>> I tried to forcibly remove a specific node because I was wondering how to
>>> recover a node.
>>> But I can't solve the problem.
>>>
>>> Below is my scenario.
>>>
 bin/solr -e cloud
>>> # make 2 nodes with 8983, 8984 port
>>> # collection structure is below
>>> # example/cloud/node1 (with 8983)
>>> # /cloudexample_shard1_replica2
>>> # /cloudexample_shard2_replica2
>>> # example/cloud/node2 (with 8984)
>>> # /cloudexample_shard1_replica1
>>> # /cloudexample_shard2_replica1
>>>
 java -Dc=gettingstarted -jar post.jar *.xml
>>> # indexing and searching is ok.
>>>
 bin/solr stop -p 8984
>>> # force to shutdown node2 and remove example/cloud/node2
>>>
 mkdir -p example/cloud/new_node2/solr
 cp example/cloud/node1/solr/solr.xml example/cloud/new_node2/solr
 bin/solr start -c -p 8984 -z localhost:9983 -s example/cloud/new_node2/solr
>>>
>>> everything is good so far, but after this how should I recover node2 to
>>> the same state as before the removal?


Gradle Plugin

2016-08-19 Thread Gus Heck
The other day, I finally got around to automating solr config deployment
with gradle, so now I can have this workflow:

   - Run a gradle task to get the current config from ZooKeeper.
   - See any changes not checked in (my IDE happily lights these up if I
   have the files under source control).
   - Modify whatever I want to modify (i.e. tweak solrconfig.xml).
   - Run a gradle task to distribute my edits back to the server.
   - Check in edited files without any further fuss if I like the result.

At this point it's dead simple, but already seems to be somewhat useful, so
I've released it here:

https://github.com/nsoft/solr-gradle

Comments and Suggestions welcome. Feature requests and bug reports also
welcome, especially alongside a pull request :).

-Gus

-- 
http://www.the111shift.com


Re: How to recovery node

2016-08-19 Thread Jason
Hi, Erick
I just learned that the ADDREPLICA command is for this.
When I wrote my question, I had just tried the command
http://localhost:8984/solr/admin/collections?action=ADDREPLICA&collection=cloudexample&shard=shard1&node=localhost:8984_solr
but it returned an error.
Now I know what the problem is:
the node name parameter needs the actual IP.
So now node2 with 2 replicas is running well.
But I have another question.
The newly created node2 has two shards as replicas,
named 'cloudexample_shard1_replica3' and
'cloudexample_shard2_replica3'.
Why aren't they 'cloudexample_shard1_replica2' and
'cloudexample_shard2_replica2'?
And the cloud graph looks like this:

cloudexample -- shard1 -- 10.3.4.20:8983 (Leader, Active)
                          10.3.4.20:8984 (Down)
                          10.3.4.20:8984 (Active)
             -- shard2 -- 10.3.4.20:8983 (Leader, Active)
                          10.3.4.20:8984 (Down)
                          10.3.4.20:8984 (Active)

As shown in the graph, two replicas turned out down.
What happened? What should I do?




Erick Erickson wrote
> Let Solr do it for you. In this case use the collections API
> ADDREPLICA command. You tell it what collection and
> shard you want replicated, it'll create the right structure in the
> right place and synch the index.
> 
> See:
> https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api_addreplica
> 
> If you require exact placement of the replica, specify the "node"
> parameter.
> 
> Best,
> Erick
> 
> On Fri, Aug 19, 2016 at 2:19 AM, Jason wrote:
>> I'm new to SolrCloud (ver. 6.1), so I'm on the tutorial.
>> I tried to forcibly remove a specific node because I was wondering how to
>> recover a node.
>> But I can't solve the problem.
>>
>> Below is my scenario.
>>
>>> bin/solr -e cloud
>> # make 2 nodes with 8983, 8984 port
>> # collection structure is below
>> # example/cloud/node1 (with 8983)
>> # /cloudexample_shard1_replica2
>> # /cloudexample_shard2_replica2
>> # example/cloud/node2 (with 8984)
>> # /cloudexample_shard1_replica1
>> # /cloudexample_shard2_replica1
>>
>>> java -Dc=gettingstarted -jar post.jar *.xml
>> # indexing and searching is ok.
>>
>>> bin/solr stop -p 8984
>> # force to shutdown node2 and remove example/cloud/node2
>>
>>> mkdir -p example/cloud/new_node2/solr
>>> cp example/cloud/node1/solr/solr.xml example/cloud/new_node2/solr
>>> bin/solr start -c -p 8984 -z localhost:9983 -s example/cloud/new_node2/solr
>>
>> everything is good so far, but after this how should I recover node2 to
>> the same state as before the removal?


Re: How to recovery node

2016-08-19 Thread Erick Erickson
Let Solr do it for you. In this case use the collections API
ADDREPLICA command. You tell it what collection and
shard you want replicated, it'll create the right structure in the
right place and synch the index.

See: 
https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api_addreplica

If you require exact placement of the replica, specify the "node"
parameter.
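
For example (the node value here is a guess; use the entries listed under
live_nodes in ZooKeeper):

  curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=cloudexample&shard=shard1&node=10.3.4.20:8984_solr"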

Best,
Erick

On Fri, Aug 19, 2016 at 2:19 AM, Jason  wrote:
> I'm new to SolrCloud (ver. 6.1), so I'm on the tutorial.
> I tried to forcibly remove a specific node because I was wondering how to
> recover a node.
> But I can't solve the problem.
>
> Below is my scenario.
>
>> bin/solr -e cloud
> # make 2 nodes with 8983, 8984 port
> # collection structure is below
> # example/cloud/node1 (with 8983)
> # /cloudexample_shard1_replica2
> # /cloudexample_shard2_replica2
> # example/cloud/node2 (with 8984)
> # /cloudexample_shard1_replica1
> # /cloudexample_shard2_replica1
>
>> java -Dc=gettingstarted -jar post.jar *.xml
> # indexing and searching is ok.
>
>> bin/solr stop -p 8984
> # force to shutdown node2 and remove example/cloud/node2
>
>> mkdir -p example/cloud/new_node2/solr
>> cp example/cloud/node1/solr/solr.xml example/cloud/new_node2/solr
>> bin/solr start -c -p 8984 -z localhost:9983 -s example/cloud/new_node2/solr
>
> everything is good so far, but after this how should I recover node2 to the
> same state as before the removal?


Re: posting to Solr 6.0.0

2016-08-19 Thread matt corkum
I managed to solve this with the following approach.

Create the core: ~/Downloads/solr-6.0.0/bin/solr create -c poc5
Use the Schema API of Solr 6.0.0 to push the definition for each field (I have
a few fields). Here is one example, adding the "all" field:

curl -X POST -H 'Content-type:application/json' --data-binary \
'{"add-field":{"name":"all","type":"text_en","stored":true,"indexed":true,"multiValued":true}}' \
http://localhost:8983/solr/poc5/schema

Post the docs once the schema is defined: ~/Downloads/solr-6.0.0/bin/post -c poc5 .

I now get multivalued fields and the type of processing I want.

I bet there is an easier way, but this is working!
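
For what it's worth, the id/all/_src_-only behavior from bin/post is probably
the srcField/mapUniqueKeyOnly defaults that the data-driven configset applies
to /update/json/docs. Posting a JSON array to plain /update sidesteps that
mapping; a sketch reusing the field names above:

  curl -X POST -H 'Content-Type: application/json' \
    'http://localhost:8983/solr/poc5/update?commit=true' \
    --data-binary '[{"eid":"1-s2.0-S073510971104383X","contentsubtype":"PGL","issn":"07351097"}]'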

Thanks
Matt

> On Aug 19, 2016, at 8:06 AM, matt corkum  wrote:
> 
> Hi —
> 
> I'm looking for a way to send many JSON files to a classic index schema (not
> managed).
>
> I created a schema.xml and a solrconfig.xml that allow me to post JSON using
> the Solr Admin Documents screen.
> I am using <schemaFactory class="ClassicIndexSchemaFactory"/> in solrconfig.xml,
> so it's not a managed schema.
>
> Posting my JSON (below) via the Solr Admin Documents screen
> (http://localhost:8983/solr/#/poc2/documents) WORKS! It creates a field/value
> for each of the tuples below, so the schema.xml is valid and working, and I
> get a multivalued field.
>
> However, when I use the bin/post tool or a curl POST, both produce only the
> id, all, _src_, and _version_ fields. The _src_ field holds my original
> document (below).
>
> Does anyone have a suggestion on the proper use of curl or the Solr post tool
> (data handler?) to post a directory of JSON?
> Do I need to adjust my solrconfig.xml to make something work?
> 
> Thank you
> 
> Here is a sample curl
>
> ...
>
> here is a sample Solr post: ~/solr-6.0.0/bin/post -c poc2 . (done in the
> directory of JSON or its parent) — all docs are visited and sent to the Solr
> index (only getting the id, _src_, all, and _version_ fields).
> 
> I appreciate any comments on what might need adjustment.
> 
> I will see about reverse engineering the POST of the Solr Admin to discover 
> the post of the JSON (it has to be different in some manner).
> 
> Thank you
> Matt



Re: DataImport-Solr6: Nested Entities

2016-08-19 Thread Shawn Heisey
On 8/18/2016 5:10 PM, Peri Subrahmanya wrote:
> Hi,
>
> I have a simple one-to-many relationship setup in the data-import.xml and 
> when I try to index it using the dataImportHandler, Solr complains of “no 
> unique id found”. 
>
> managed-schema.xml:
> <uniqueKey>id</uniqueKey>
> solrconfig.xml:
> 
>   
> id
>   

> <entity ... query="select blah blah from course where
> catalog_id='${catalog.catalog_id}'">
> 

Can you get the full error message(s) from the solr.log file, including
the full java stacktrace(s)?  Many error messages are dozens of lines
long, because they include Java stacktraces.  For correct
interpretation, we also need the exact version of Solr that you're
running.  Your subject indicates Solr6, but there are three releases so
far in the 6.x series.

If you want your update processor chain to be used by DIH, I think you
need to make it the default chain with 'default="true"' in the opening
tag.  There might be a way to apply a specific update chain in DIH, but
if there is, you need to give it a name, which yours doesn't have.
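
For illustration, a minimal sketch of a named chain made the default (the
processor list is invented for the example):

  <updateRequestProcessorChain name="mychain" default="true">
    <processor class="solr.LogUpdateProcessorFactory"/>
    <!-- RunUpdateProcessorFactory must come last or nothing gets indexed -->
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>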

I am using a custom update chain with both DIH and explicit update
requests, which I do like this:



Thanks,
Shawn



posting to Solr 6.0.0

2016-08-19 Thread matt corkum
Hi —

I'm looking for a way to send many JSON files to a classic index schema (not
managed).

I created a schema.xml and a solrconfig.xml that allow me to post JSON using
the Solr Admin Documents screen.
I am using <schemaFactory class="ClassicIndexSchemaFactory"/> in solrconfig.xml,
so it's not a managed schema.

Posting my JSON (below) via the Solr Admin Documents screen
(http://localhost:8983/solr/#/poc2/documents) WORKS! It creates a field/value
for each of the tuples below, so the schema.xml is valid and working, and I
get a multivalued field.

However, when I use the bin/post tool or a curl POST, both produce only the
id, all, _src_, and _version_ fields. The _src_ field holds my original
document (below).

Does anyone have a suggestion on the proper use of curl or the Solr post tool
(data handler?) to post a directory of JSON?
Do I need to adjust my solrconfig.xml to make something work?

Thank you

Here is a sample curl

curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/poc2/update/json/docs' \
  --data-binary '
{
  "eid": "1-s2.0-S073510971104383X",
  "contentsubtype": "PGL",
  "issn": "07351097",
  "all": "JACC (Journal of the American College of Cardiology) 2011 ACCF/AHA Guideline for the Diagnosis and Treatment of Hypertrophic Cardiomyopathy: Executive Summary Recommendations for HCM Stress Testing—Recommendations 2.4 Class IIa 1 Treadmill exercise testing is reasonable to determine functional capacity and response to therapy in patients with HCM. (Level of Evidence: C) 2 Treadmill testing with monitoring of an ECG and blood pressure is reasonable for SCD risk stratification in patients with HCM ( 69–71 ). (Level of Evidence: B) 3 In patients with HCM who do not have a resting peak instantaneous gradient of greater than or equal to 50 mm Hg, exercise echocardiography is reasonable for the detection and quantification of exercise-induced dynamic LVOT obstruction ( 67,70–72 ). (Level of Evidence: B)"
}'

Here is a sample Solr post: ~/solr-6.0.0/bin/post -c poc2 . (done in the
directory of JSON or its parent) — all docs are visited and sent to the Solr
index (only getting the id, _src_, all, and _version_ fields).

I appreciate any comments on what might need adjustment.

I will see about reverse engineering the POST of the Solr Admin to discover the 
post of the JSON (it has to be different in some manner).

Thank you
Matt

Re: Append to new solr.log file daily

2016-08-19 Thread Alexandre Rafalovitch
This _should_ be controlled by the resources/log4j.properties file,
which I thought was configured with a rolling appender.

What does yours have?
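
If daily rollover is the goal, a minimal sketch against the stock file (log4j
1.2, which Solr 6.1 ships; the size-based MaxFileSize/MaxBackupIndex lines
would need removing):

  # roll solr.log at midnight into dated files like solr.log.2016-08-19
  log4j.appender.file=org.apache.log4j.DailyRollingFileAppender
  log4j.appender.file.DatePattern='.'yyyy-MM-dd
  log4j.appender.file.File=${solr.log}/solr.log
  log4j.appender.file.layout=org.apache.log4j.PatternLayout
  log4j.appender.file.layout.ConversionPattern=%-5p - %d{yyyy-MM-dd HH:mm:ss.SSS}; %C; %m%n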

Regards,
   Alex.

Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 19 August 2016 at 17:47, Zheng Lin Edwin Yeo  wrote:
> Hi,
>
> I found that all the logs in Solr are appended in the same solr.log file
> until Solr is restarted. If I keep Solr running for several weeks without
> restarting (which should be the case in production environment), the size
> of the solr.log can be very large (more than several GB in some cases).
>
> Is it possible to do some configuration, so that the log is appended to a new
> solr.log file every day, instead of until Solr is restarted? This will make
> reviewing the logs much easier.
>
> I'm using Solr 6.1.0
>
> Regards,
> Edwin


How to recovery node

2016-08-19 Thread Jason
I'm new to SolrCloud (ver. 6.1), so I'm on the tutorial.
I tried to forcibly remove a specific node because I was wondering how to
recover a node.
But I can't solve the problem.

Below is my scenario.

> bin/solr -e cloud 
# make 2 nodes with 8983, 8984 port
# collection structure is below
# example/cloud/node1 (with 8983)
# /cloudexample_shard1_replica2
# /cloudexample_shard2_replica2
# example/cloud/node2 (with 8984)
# /cloudexample_shard1_replica1
# /cloudexample_shard2_replica1

> java -Dc=gettingstarted -jar post.jar *.xml
# indexing and searching is ok.

> bin/solr stop -p 8984
# force to shutdown node2 and remove example/cloud/node2

> mkdir -p example/cloud/new_node2/solr
> cp example/cloud/node1/solr/solr.xml example/cloud/new_node2/solr
> bin/solr start -c -p 8984 -z localhost:9983 -s example/cloud/new_node2/solr

everything is good so far, but after this how should I recover node2 to the
same state as before the removal?







Append to new solr.log file daily

2016-08-19 Thread Zheng Lin Edwin Yeo
Hi,

I found that all the logs in Solr are appended in the same solr.log file
until Solr is restarted. If I keep Solr running for several weeks without
restarting (which should be the case in production environment), the size
of the solr.log can be very large (more than several GB in some cases).

Is it possible to do some configuration, so that the log is appended to a new
solr.log file every day, instead of until Solr is restarted? This will make
reviewing the logs much easier.

I'm using Solr 6.1.0

Regards,
Edwin


Auto-deletion of Solr logs

2016-08-19 Thread Zheng Lin Edwin Yeo
Hi,

I would like to check: is there any setting in Solr that we can use to
auto-delete old Solr logs (e.g. those older than one month), or do we
have to write a separate script to achieve that?
Those are the logs that are stored in \server\logs.
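
Absent a built-in retention setting, a scheduled cleanup script is the usual
route. A sketch assuming a Unix-like host and an illustrative install path (on
Windows, a Task Scheduler equivalent would be needed):

  # delete rolled Solr logs older than 30 days
  find /opt/solr/server/logs -name "solr.log.*" -mtime +30 -delete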

I'm using Solr 6.1.0.

Regards,
Edwin


Re: Solr 6: Use facet with Streaming Expressions- LeftOuterJoin

2016-08-19 Thread vrindavda
Thanks again!

I will try this and follow up.


