Re: Preventing solr cache flush when committing

2018-04-24 Thread Shawn Heisey
On 4/23/2018 11:56 PM, Papa Pappu wrote:
> I've written down my query over stack-overflow. Here is the link for that :
> https://stackoverflow.com/questions/49993681/preventing-solr-cache-flush-when-commiting
>
> In short, I am facing troubles maintaining my solr caches when commits
> happen and the question provides detailed description of the same.

The information in Solr caches relies on Lucene internal doc IDs.

When changes to the index happen and a new searcher is created, there is
absolutely no guarantee that the Lucene document IDs will be the same as
they were on the old searcher.  Solr must assume that the IDs are
different, so it has no choice but to throw away its cache entries when
a new searcher is created.

> Based on my use-case if someone can recommend what settings I should use or
> practices I should follow it'll be really helpful.

This is similar to the information you got in the SO post.  You can rely on
newSearcher cache warming, and the autowarming configured in the caches
themselves.  Be careful about making autowarmCount too large.  Large
values there can make commits very slow.

The basic advice for getting the most out of Solr caches is to put off
opening new searchers as long as you can.  Commit less frequently.
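As an illustration, here is a minimal solrconfig.xml sketch of both mechanisms (the cache sizes, autowarmCount, and the warming query are illustrative values, not recommendations):

```
<!-- a cache that re-populates its hottest 32 entries on each new searcher -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="32"/>

<!-- hand-picked queries run against every new searcher to pre-fill caches -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">some popular query</str><str name="sort">price asc</str></lst>
  </arr>
</listener>
```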

Thanks,
Shawn



Re: solr cell: write entire file content binary to index along with metadata

2018-04-24 Thread Shawn Heisey
On 4/24/2018 10:26 AM, Lee Carroll wrote:
> Does the solr cell contrib give access to the file's raw content along with
> the extracted metadata?

That's not usually the kind of information you want to have in a Solr
index.  Most of the time, there will be an entry in the Solr index that
tells the system making queries how to locate the actual data -- a
filename, a URL, a database lookup key, etc.

I have no idea whether solr-cell can put the info in the index.  My best
guess would be that it can't, since putting the entire binary content
into the index isn't recommended.

We don't recommend using solr-cell for production indexing.  If you
follow recommendations and write your own indexing program using Tika,
then you can do pretty much anything you want, including writing the
full content into the index.
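For reference, a minimal sketch of such a standalone indexer using Tika and SolrJ (the core URL and field names are hypothetical, and storing the raw bytes base64-encoded in a string field is just one possible way to keep the full content):

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Base64;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.sax.BodyContentHandler;

public class TikaIndexer {
  public static void main(String[] args) throws Exception {
    Path file = Paths.get(args[0]);

    // Extract text and metadata with Tika.
    AutoDetectParser parser = new AutoDetectParser();
    BodyContentHandler handler = new BodyContentHandler(-1); // -1 = no write limit
    Metadata metadata = new Metadata();
    try (InputStream in = Files.newInputStream(file)) {
      parser.parse(in, handler, metadata);
    }

    // Build the Solr document: extracted text, metadata, and the raw bytes.
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", file.toString());
    doc.addField("content_txt", handler.toString());
    doc.addField("raw_content_b64",
        Base64.getEncoder().encodeToString(Files.readAllBytes(file)));
    for (String name : metadata.names()) {
      doc.addField("meta_" + name + "_ss", metadata.get(name));
    }

    try (HttpSolrClient solr =
        new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build()) {
      solr.add(doc);
      solr.commit();
    }
  }
}
```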

Thanks,
Shawn



Re: IndexFetcher cannot download index file

2018-04-24 Thread Shawn Heisey
On 4/24/2018 1:53 PM, Markus Jelsma wrote:
> I don't see stack traces for most WARNs, for example the checksum
> warning on recovery (other thread), or the Trie* deprecations. 

I just tried it on 7.3.0.  Added a line to CoreContainer.java to log an
exception at warn when Solr is starting:

    log.warn("Start warning!", new Exception());

This produces the following in the logging tab:

https://www.dropbox.com/s/6q5x6gcidutg01a/solr-730-warn-stacktrace-log.png?dl=0

The logging tab has the annoying habit of refreshing and hiding the
stacktrace very quickly after opening it.

I can't reproduce what you describe -- WARN messages with no stacktraces
in the admin UI.

Thanks,
Shawn



Re: Solr 7.3 debug/explain with boost applied

2018-04-24 Thread Nawab Zada Asad Iqbal
I didn't know you could add boosts like that (boost=2). Are you boosting on
a field or document by using that syntax?

On Sun, Apr 22, 2018 at 10:51 PM, Ryan Yacyshyn 
wrote:

> Hi all,
>
> When viewing the explain under debug=true in Solr 7.3.0 using
> the edismax query parser with a boost, I only see the "boost" part of the
> explain. Without applying a boost I see the full explain. Is this the
> expected behaviour?
>
> Here's how to check using the techproducts example:
>
> bin/solr -e techproducts
>
> ```
> http://localhost:8983/solr/techproducts/select?q={!edismax}samsung&qf=name&debug=true
> ```
>
> returns:
>
> ```
> "debug": {
> "rawquerystring": "{!edismax}samsung",
> "querystring": "{!edismax}samsung",
> "parsedquery": "+DisjunctionMaxQuery((name:samsung))",
> "parsedquery_toString": "+(name:samsung)",
> "explain": {
>   "SP2514N": "\n2.3669035 = weight(name:samsung in 1)
> [SchemaSimilarity], result of:\n  2.3669035 = score(doc=1,freq=1.0 =
> termFreq=1.0\n), product of:\n2.6855774 = idf, computed as log(1 +
> (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:\n  1.0 = docFreq\n
> 21.0 = docCount\n0.8813388 = tfNorm, computed as (freq * (k1 + 1))
> / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:\n  1.0
> = termFreq=1.0\n  1.2 = parameter k1\n  0.75 = parameter b\n
> 7.5238094 = avgFieldLength\n  10.0 = fieldLength\n"
> },
> "QParser": "ExtendedDismaxQParser",
> ...
> ```
>
> If I just add boost=2 to this, I get this explain back:
>
> ```
> "debug": {
> "rawquerystring": "{!edismax}samsung",
> "querystring": "{!edismax}samsung",
> "parsedquery": "FunctionScoreQuery(FunctionScoreQuery(+(name:samsung),
> scored by boost(const(2",
> "parsedquery_toString": "FunctionScoreQuery(+(name:samsung), scored by
> boost(const(2)))",
> "explain": {
>   "SP2514N": "\n4.733807 = product of:\n  1.0 = boost\n  4.733807 =
> boost(const(2))\n"
> },
> "QParser": "ExtendedDismaxQParser",
> ...
> ```
>
> Is this normal? I was expecting to see more like the first example, with
> the addition of the boost applied.
>
> Thanks,
> Ryan
>


RE: IndexFetcher cannot download index file

2018-04-24 Thread Markus Jelsma
Inline.

-Original message-
> From:Shawn Heisey 
> Sent: Tuesday 24th April 2018 21:18
> To: solr-user@lucene.apache.org
> Subject: Re: IndexFetcher cannot download index file
> 
> On 4/24/2018 12:36 PM, Markus Jelsma wrote:
> > I should be more precise, I said the stack traces of WARN are not shown, 
> > only the messages are visible. The 'low disk space' line was hidden in the 
> > stack trace of the WARN, as you can see in the pasted example, thus 
> > invisible in the GUI with default settings.
> 
> Hmm.  I can see stacktraces on WARN messages in 6.6 and earlier versions
> in the admin UI.  I don't have running servers with newer versions to
> check.  That sounds a little like a bug.

I don't see stack traces for most WARNs, for example the checksum warning on 
recovery (other thread), or the Trie* deprecations.

> 
> > If the log level of the message were raised to ERROR, it would be visible. 
> > I would think that any recovery WARN message should be raised to ERROR or 
> > even FATAL, because in this state whatever Solr does, it will never recover 
> > and in the mean time not show why the recovery failed.
> 
> That particular log probably SHOULD be an error.  Will look into that
> when I have some time.

I'll open a ticket tomorrow so we can track it.

Thanks!
Markus

> 
> Thanks,
> Shawn
> 
> 


Re: IndexFetcher cannot download index file

2018-04-24 Thread Shawn Heisey
On 4/24/2018 12:36 PM, Markus Jelsma wrote:
> I should be more precise, I said the stack traces of WARN are not shown, only 
> the messages are visible. The 'low disk space' line was hidden in the stack 
> trace of the WARN, as you can see in the pasted example, thus invisible in 
> the GUI with default settings.

Hmm.  I can see stacktraces on WARN messages in 6.6 and earlier versions
in the admin UI.  I don't have running servers with newer versions to
check.  That sounds a little like a bug.

> If the log level of the message were raised to ERROR, it would be visible. I 
> would think that any recovery WARN message should be raised to ERROR or even 
> FATAL, because in this state whatever Solr does, it will never recover and in 
> the mean time not show why the recovery failed.

That particular log probably SHOULD be an error.  Will look into that
when I have some time.

Thanks,
Shawn



Search support for regex style spaces

2018-04-24 Thread tedsolr
Does Solr have regex search support for "\s"? As in: q=FIELD:/starts with[\s0-9]*/

Both \s and \\s do not seem to have an effect. Thanks.
Using Solr 5.5.4.





RE: IndexFetcher cannot download index file

2018-04-24 Thread Markus Jelsma
To be even more precise, it seems some WARN logs do show a stack trace in the 
GUI, but others don't. For example:

org.apache.solr.common.SolrException: URLDecoder: Invalid character encoding 
detected after position 23 of query string / form data (while parsing as UTF-8)
at 
org.apache.solr.servlet.SolrRequestParsers.decodeChars(SolrRequestParsers.java:421)
at 
org.apache.solr.servlet.SolrRequestParsers.decodeBuffer(SolrRequestParsers.java:437)
at 
org.apache.solr.servlet.SolrRequestParsers.parseFormDataContent(SolrRequestParsers.java:404)
at 
org.apache.solr.servlet.SolrRequestParsers.parseQueryString(SolrRequestParsers.java:304)

I don't know why there is a difference in showing stack traces.

 
 
-Original message-
> From:Markus Jelsma 
> Sent: Tuesday 24th April 2018 20:36
> To: solr-user@lucene.apache.org
> Subject: RE: IndexFetcher cannot download index file
> 
> Hello Shawn,
> 
> I should be more precise, I said the stack traces of WARN are not shown, only 
> the messages are visible. The 'low disk space' line was hidden in the stack 
> trace of the WARN, as you can see in the pasted example, thus invisible in 
> the GUI with default settings.
> 
> If the log level of the message were raised to ERROR, it would be visible. I 
> would think that any recovery WARN message should be raised to ERROR or even 
> FATAL, because in this state whatever Solr does, it will never recover and in 
> the mean time not show why the recovery failed.
> 
> What do you think?
> 
> Regards,
> Markus
> 
>  
>  
> -Original message-
> > From:Shawn Heisey 
> > Sent: Tuesday 24th April 2018 19:12
> > To: solr-user@lucene.apache.org
> > Subject: Re: IndexFetcher cannot download index file
> > 
> > On 4/24/2018 9:46 AM, Markus Jelsma wrote:
> > > Disk space was WARN level. It seems only stack traces of ERROR level 
> > > messages are visible via the GUI, and that is where the 'No space left' 
> > > was hiding. Without logging in and inspecting the logs manually, you will 
> > > never notice that message.
> > 
> > The logging tab in the admin UI will by default show all messages at
> > WARN and higher severity.  If you're not seeing WARN messages there,
> > then a configuration has been changed to get rid of them.
> > 
> > Regarding general disk space requirements:  There should always be
> > enough disk space so that all the indexes can double in size
> > temporarily.  To be absolutely certain you won't run out, it should be
> > enough space so they can triple in size temporarily -- there is a
> > certain indexing scenario where this can happen in the wild. 
> > Replication can also create an entirely separate copy of the index
> > temporarily, so the same disk space recommendations apply for that too.
> > 
> > Thanks,
> > Shawn
> > 
> > 
> 


RE: IndexFetcher cannot download index file

2018-04-24 Thread Markus Jelsma
Hello Shawn,

I should be more precise, I said the stack traces of WARN are not shown, only 
the messages are visible. The 'low disk space' line was hidden in the stack 
trace of the WARN, as you can see in the pasted example, thus invisible in the 
GUI with default settings.

If the log level of the message were raised to ERROR, it would be visible. I 
would think that any recovery WARN message should be raised to ERROR or even 
FATAL, because in this state whatever Solr does, it will never recover and in 
the mean time not show why the recovery failed.

What do you think?

Regards,
Markus

 
 
-Original message-
> From:Shawn Heisey 
> Sent: Tuesday 24th April 2018 19:12
> To: solr-user@lucene.apache.org
> Subject: Re: IndexFetcher cannot download index file
> 
> On 4/24/2018 9:46 AM, Markus Jelsma wrote:
> > Disk space was WARN level. It seems only stack traces of ERROR level 
> > messages are visible via the GUI, and that is where the 'No space left' was 
> > hiding. Without logging in and inspecting the logs manually, you will never 
> > notice that message.
> 
> The logging tab in the admin UI will by default show all messages at
> WARN and higher severity.  If you're not seeing WARN messages there,
> then a configuration has been changed to get rid of them.
> 
> Regarding general disk space requirements:  There should always be
> enough disk space so that all the indexes can double in size
> temporarily.  To be absolutely certain you won't run out, it should be
> enough space so they can triple in size temporarily -- there is a
> certain indexing scenario where this can happen in the wild. 
> Replication can also create an entirely separate copy of the index
> temporarily, so the same disk space recommendations apply for that too.
> 
> Thanks,
> Shawn
> 
> 


Re: Preventing solr cache flush when committing

2018-04-24 Thread Lee Carroll
From memory, try the following:
Don't manually commit from the client after batch indexing.
Set the soft commit to a long time interval: as long as is acceptable to run
stale, say 5 mins or longer if you can.
Set the hard commit to be short (seconds) to keep everything neat and tidy
regarding updates, and to avoid backing up log files.
Set openSearcher=false.

I'm pretty sure that works for at least one of our indices. It's worth a go.
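Expressed as a solrconfig.xml sketch (the intervals are illustrative; tune them to how stale you can afford to run):

```
<autoCommit>
  <maxTime>15000</maxTime>           <!-- hard commit every 15s: flushes the tlog -->
  <openSearcher>false</openSearcher> <!-- does not open a searcher, so caches survive -->
</autoCommit>

<autoSoftCommit>
  <maxTime>300000</maxTime>          <!-- soft commit every 5 min: when changes become visible -->
</autoSoftCommit>
```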

Lee C

On 24 April 2018 at 06:56, Papa Pappu  wrote:

> Hi,
> I've written down my query over stack-overflow. Here is the link for that :
> https://stackoverflow.com/questions/49993681/preventing-solr-cache-flush-when-commiting
>
> In short, I am facing troubles maintaining my solr caches when commits
> happen and the question provides detailed description of the same.
>
> Based on my use-case if someone can recommend what settings I should use or
> practices I should follow it'll be really helpful.
>
> Thanks and regards,
> Dmitri
>


Re: IndexFetcher cannot download index file

2018-04-24 Thread Shawn Heisey
On 4/24/2018 9:46 AM, Markus Jelsma wrote:
> Disk space was WARN level. It seems only stack traces of ERROR level messages 
> are visible via the GUI, and that is where the 'No space left' was hiding. 
> Without logging in and inspecting the logs manually, you will never notice 
> that message.

The logging tab in the admin UI will by default show all messages at
WARN and higher severity.  If you're not seeing WARN messages there,
then a configuration has been changed to get rid of them.

Regarding general disk space requirements:  There should always be
enough disk space so that all the indexes can double in size
temporarily.  To be absolutely certain you won't run out, it should be
enough space so they can triple in size temporarily -- there is a
certain indexing scenario where this can happen in the wild. 
Replication can also create an entirely separate copy of the index
temporarily, so the same disk space recommendations apply for that too.

Thanks,
Shawn



Preventing solr cache flush when committing

2018-04-24 Thread Papa Pappu
Hi,
I've written down my query over stack-overflow. Here is the link for that :
https://stackoverflow.com/questions/49993681/preventing-solr-cache-flush-when-commiting

In short, I am facing troubles maintaining my solr caches when commits
happen and the question provides detailed description of the same.

Based on my use-case if someone can recommend what settings I should use or
practices I should follow it'll be really helpful.

Thanks and regards,
Dmitri


Re: versions of documentation: suggestion for improvement

2018-04-24 Thread Chris Hostetter

: I also noticed that there's the concept of "latest" (similar to "current"
: in postgres documentation) in solr. This is pretty cool. I am afraid
: though, that this currently is somewhat confusing. E.g., if I search for
: managed schema in google I get this as 1st url:
: 
: 
https://lucene.apache.org/solr/guide/6_6/schema-factory-definition-in-solrconfig.html
: 
: now if I try to replace 6_6 with latest:

I'm not sure where you got the impression that *replacing* a version 
number with "/latest" in the URL was designed to work ... the redirects 
are set up so that if you *remove* the version number, the rest of the path 
will redirect to the current version.

https://lucene.apache.org/solr/guide/schema-factory-definition-in-solrconfig.html


-Hoss
http://www.lucidworks.com/


CDCR Bootstrap

2018-04-24 Thread Susheel Kumar
Hello,

I am wondering under what conditions the CDCR bootstrap
process gets triggered.  I did notice it getting triggered after I stopped
CDCR and then started it again later, and now I am trying to reproduce the
same behavior.

In case the target cluster is left behind and the buffer was disabled on the
source, I would like the CDCR bootstrap to trigger and sync the target.

Would deleting records from the target and then starting CDCR trigger a
bootstrap?

Thanks,
Susheel


solr cell: write entire file content binary to index along with metadata

2018-04-24 Thread Lee Carroll
Does the solr cell contrib give access to the file's raw content along with
the extracted metadata?

cheers Lee C


Re: IndexFetcher cannot download index file

2018-04-24 Thread Charlie Hull

On 24/04/2018 16:44, Walter Underwood wrote:

In Ultraseek, we checked free disk space before starting a merge or 
replication. If there wasn’t enough space, it emailed an error to the admin and 
disabled merging or replication, respectively.

Checking free disk space on Windows was a pain.


On a related topic, we built something that can block connections if 
there's no space to accept new documents for indexing:

https://github.com/flaxsearch/harahachibu

Cheers

Charlie


wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


On Apr 24, 2018, at 8:39 AM, Shawn Heisey  wrote:

On 4/24/2018 6:52 AM, Markus Jelsma wrote:

Forget about it, recovery got a java.io.IOException: No space left on device 
but it wasn't clear until I inspected the real logs.

The logs in the web admin didn't show the disk space exception, even when I 
expand the log line. Maybe that could be changed.


What was the severity of the log entry showing the disk space exception?  Can 
you share the whole message/stacktrace?

If it doesn't show up in the admin UI logging tab, that would suggest that it 
was an INFO level log, when it should probably be ERROR.

Thanks,
Shawn







--
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk


RE: IndexFetcher cannot download index file

2018-04-24 Thread Markus Jelsma
Hello,

Disk space was WARN level. It seems only stack traces of ERROR level messages 
are visible via the GUI, and that is where the 'No space left' was hiding. 
Without logging in and inspecting the logs manually, you will never notice that 
message.

Regards,
Markus

2018-04-24 12:23:44.215 WARN  
(recoveryExecutor-4-thread-1-processing-n:idx3.gr.nl.openindex.io:8983_solr 
x:search_shard1_replica2 s:shard1 c:search r:core_node3) [c:search s:shard1 
r:core_
node3 x:search_shard1_replica2] o.a.s.h.IndexFetcher Error in fetching file: 
_l5d.cfs (downloaded 48234496 of 218562805 bytes)
java.io.IOException: No space left on device
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:211)
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
at java.nio.channels.Channels.writeFully(Channels.java:101)
at java.nio.channels.Channels.access$000(Channels.java:61)
at java.nio.channels.Channels$1.write(Channels.java:174)
at 
org.apache.lucene.store.FSDirectory$FSIndexOutput$1.write(FSDirectory.java:419)
at java.util.zip.CheckedOutputStream.write(CheckedOutputStream.java:73)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
at 
org.apache.lucene.store.OutputStreamIndexOutput.writeBytes(OutputStreamIndexOutput.java:53)
at 
org.apache.solr.handler.IndexFetcher$DirectoryFile.write(IndexFetcher.java:1749)



 
 
-Original message-
> From:Shawn Heisey 
> Sent: Tuesday 24th April 2018 17:39
> To: solr-user@lucene.apache.org
> Subject: Re: IndexFetcher cannot download index file
> 
> On 4/24/2018 6:52 AM, Markus Jelsma wrote:
> > Forget about it, recovery got a java.io.IOException: No space left on 
> > device but it wasn't clear until I inspected the real logs.
> >
> > The logs in the web admin didn't show the disk space exception, even when I 
> > expand the log line. Maybe that could be changed.
> 
> What was the severity of the log entry showing the disk space 
> exception?  Can you share the whole message/stacktrace?
> 
> If it doesn't show up in the admin UI logging tab, that would suggest 
> that it was an INFO level log, when it should probably be ERROR.
> 
> Thanks,
> Shawn
> 
> 


Re: IndexFetcher cannot download index file

2018-04-24 Thread Walter Underwood
In Ultraseek, we checked free disk space before starting a merge or 
replication. If there wasn’t enough space, it emailed an error to the admin and 
disabled merging or replication, respectively.

Checking free disk space on Windows was a pain.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Apr 24, 2018, at 8:39 AM, Shawn Heisey  wrote:
> 
> On 4/24/2018 6:52 AM, Markus Jelsma wrote:
>> Forget about it, recovery got a java.io.IOException: No space left on device 
>> but it wasn't clear until I inspected the real logs.
>> 
>> The logs in the web admin didn't show the disk space exception, even when I 
>> expand the log line. Maybe that could be changed.
> 
> What was the severity of the log entry showing the disk space exception?  Can 
> you share the whole message/stacktrace?
> 
> If it doesn't show up in the admin UI logging tab, that would suggest that it 
> was an INFO level log, when it should probably be ERROR.
> 
> Thanks,
> Shawn
> 



Re: IndexFetcher cannot download index file

2018-04-24 Thread Shawn Heisey

On 4/24/2018 6:52 AM, Markus Jelsma wrote:

Forget about it, recovery got a java.io.IOException: No space left on device 
but it wasn't clear until I inspected the real logs.

The logs in the web admin didn't show the disk space exception, even when I 
expand the log line. Maybe that could be changed.


What was the severity of the log entry showing the disk space 
exception?  Can you share the whole message/stacktrace?


If it doesn't show up in the admin UI logging tab, that would suggest 
that it was an INFO level log, when it should probably be ERROR.


Thanks,
Shawn



Re: SolrCloud DIH (Data Import Handler) MySQL 404

2018-04-24 Thread Shawn Heisey

On 4/24/2018 2:03 AM, msaunier wrote:

If I access the interface, I get a null pointer exception:

null:java.lang.NullPointerException
at 
org.apache.solr.handler.RequestHandlerBase.getVersion(RequestHandlerBase.java:233)


The line of code where this exception occurred uses fundamental Java 
methods. Based on the error, either the getClass method common to all 
Java objects, or the getPackage method on the class, is returning null.  
That shouldn't be possible.  This has me wondering whether there is 
something broken in your particular Solr installation -- corrupt jars, 
or something like that.  Or maybe something broken in your Java.
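For what it's worth, a paraphrased sketch of the kind of chained call that can fail this way (not the exact Solr source):

```java
// Paraphrased sketch of the failing pattern; not the exact Solr source.
public class VersionExample {
  public String getVersion() {
    // getClass() can never return null, but Class.getPackage() is documented
    // to return null when the class's loader did not define a package for it
    // (plausible for code loaded via runtimeLib); the chained call below then
    // throws a NullPointerException.
    return getClass().getPackage().getSpecificationVersion();
  }
}
```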


Thanks,
Shawn



Re: SolrCloud cluster does not accept new documents for indexing

2018-04-24 Thread Shawn Heisey

On 4/24/2018 6:30 AM, Chris Ulicny wrote:

I haven't worked with AWS, but recently we tried to move some of our solr
instances to a cloud in Google's Cloud offering, and it did not go well.
All of our problems ended up stemming from the fact that the I/O is
throttled. Any complicated enough query would require too many disk reads
to return the results in a reasonable time when being throttled. SSDs were
better, but not at a practical cost, and not as performant as our own bare metal.


If there's enough memory installed beyond what is required for the Solr 
heap, then Solr will rarely need to actually read the disk to satisfy a 
query.  That is the secret to stellar performance.  If switching to 
faster disks made a big difference in query performance, adding memory 
would yield an even greater improvement.


https://wiki.apache.org/solr/SolrPerformanceProblems#RAM


When we were doing the initial indexing, the indexing processes would get
to a point where the updates were taking minutes to complete and the cause
was throttled write ops.


Indexing speed is indeed affected by disk speed, and adding memory can't 
fix that particular problem.  Using a storage controller with a large 
amount of battery-backed cache memory can improve it.



-- set the max threads and max concurrent merges of the mergeScheduler to
be 1 (or very low). This prevented excessive IO during indexing.


The max threads should be at 1 in the merge scheduler, but the max 
merges should actually be *increased*.  I use a value of 6 for that.  
With SSD disks, the max threads can be increased, but I wouldn't push it 
very high.
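A sketch of those settings in the indexConfig section of solrconfig.xml, using the values mentioned above:

```
<indexConfig>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <int name="maxMergeCount">6</int>  <!-- let several merges queue up -->
    <int name="maxThreadCount">1</int> <!-- but run one at a time on spinning disks -->
  </mergeScheduler>
</indexConfig>
```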


Thanks,
Shawn



Re: Using Solr / Lucene with OpenJDK

2018-04-24 Thread Shawn Heisey

On 4/24/2018 8:50 AM, Steven White wrote:

We currently support both Oracle and IBM Java to run Solr and I'm tasked to
switch over to OpenJDK.


Oracle Java is the preferred choice.  OpenJDK should work very well, 
as long as it's at least version 7.  Recent Solr versions require Java 
8, so that should be no problem.


OpenJDK versions before 7, and IBM's Java, are not recommended.  OpenJDK 
6 has known bugs, and IBM's Java enables so many performance enhancement 
optimizations in the language that parts of Lucene misbehave when 
running in IBM Java.



Does anyone use Solr, any version, with OpenJDK?  If so, what has been your
experience?  Also, what platforms have you used it on?


I've used it on Linux.  It worked without issues.


I run Solr on Windows, Linux, AIX and Solaris and on each of those
platforms, I need to support both 32 and 64 bit Java.


Another strong recommendation:  Don't use 32-bit Java. There's nothing 
technically wrong with it, but it artificially limits the size of the 
heap to 2GB.  It doesn't take super-big indexes to require more heap 
memory than that.


As mentioned early on, Oracle Java is what we prefer.  OpenJDK is the 
reference implementation for the language starting with version 7, 
though -- so we can be pretty sure that you're not missing anything by 
using OpenJDK.


Thanks,
Shawn



Re: SolrCloud cluster does not accept new documents for indexing

2018-04-24 Thread Mikhail Khludnev
Denis,
Can you enable infoStream
https://lucene.apache.org/solr/guide/6_6/indexconfig-in-solrconfig.html#IndexConfiginSolrConfig-OtherIndexingSettings
and examine the logs for throttling?
And what if you try without auto-commit?
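Enabling infoStream is a one-line change in the indexConfig section of solrconfig.xml (per the page linked above):

```
<indexConfig>
  <infoStream>true</infoStream> <!-- verbose low-level indexing/merge diagnostics in the log -->
</indexConfig>
```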

On Tue, Apr 24, 2018 at 12:37 AM, Denis Demichev  wrote:

> I conducted another experiment today with local SSD drives, but this did
> not seem to fix my problem.
> Don't see any extensive I/O in this case:
>
>
> Device:  tps    kB_read/s  kB_wrtn/s  kB_read   kB_wrtn
>
> xvda     1.76   88.83      5.52       1256191   77996
>
> xvdb     13.95  111.30     56663.93   1573961   801303364
>
> xvdb - is the device where SolrCloud is installed and data files are kept.
>
> What I see:
> - There are 17 "Lucene Merge Thread #..." running. Some of them are
> blocked, some of them are RUNNING
> - updateExecutor-N-thread-M threads are in parked mode and number of docs
> that I am able to submit is still low
> - Tried to change maxIndexingThreads, set it to something high. This seems
> to prolong the time when cluster is accepting new indexing requests and
> keeps CPU utilization a lot higher while the cluster is merging indexes
>
> Could anyone please point me to the right direction (documentation or Java
> classes) where I can read about how data is passed from updateExecutor
> thread pool to Merge Threads? I assume there should be some internal
> blocking queue or something similar.
> Still cannot wrap my head around how Solr blocks incoming connections. Non-merged
> indexes are not kept in memory, so I don't clearly understand why
> Solr cannot keep writing index files to HDD while other threads are merging
> indexes (since this is a continuous process anyway).
>
> Does anyone use SPM monitoring tool for that type of problems? Is it of
> any use at all?
>
>
> Thank you in advance.
>
> [image: image.png]
>
>
> Regards,
> Denis
>
>
> On Fri, Apr 20, 2018 at 1:28 PM Denis Demichev  wrote:
>
>> Mikhail,
>>
>> Sure, I will keep everyone posted. Moving to non-HVM instance may take
>> some time, so hopefully I will be able to share my observations in the next
>> couple of days or so.
>> Thanks again for all the help.
>>
>> Regards,
>> Denis
>>
>>
>> On Fri, Apr 20, 2018 at 6:02 AM Mikhail Khludnev  wrote:
>>
>>> Denis, please let me know what it ends up with. I'm really curious
>>> regarding this case and AWS instance flavours. FWIW, since 7.4 we'll have
>>> ioThrottle=false option.
>>>
>>> On Thu, Apr 19, 2018 at 11:06 PM, Denis Demichev 
>>> wrote:
>>>
 Mikhail, Erick,

 Thank you.

 What just occurred to me - we don't use local SSD but instead we're
 using EBS volumes.
 This was a wrong instance type that I looked at.
 Will try to set up a cluster with SSD nodes and retest.

 Regards,
 Denis


 On Thu, Apr 19, 2018 at 2:56 PM Mikhail Khludnev 
 wrote:

> I'm not sure it's the right context, but here is one guy showing a really
> low throttle boundary
> https://issues.apache.org/jira/browse/SOLR-11200?focusedCommentId=16115348&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16115348
>
>
> On Thu, Apr 19, 2018 at 8:37 PM, Mikhail Khludnev 
> wrote:
>
>> Threads are hanging on merge IO throttling
>>
>> at 
>> org.apache.lucene.index.MergePolicy$OneMergeProgress.pauseNanos(MergePolicy.java:150)
>> at 
>> org.apache.lucene.index.MergeRateLimiter.maybePause(MergeRateLimiter.java:148)
>> at 
>> org.apache.lucene.index.MergeRateLimiter.pause(MergeRateLimiter.java:93)
>> at 
>> org.apache.lucene.store.RateLimitedIndexOutput.checkRate(RateLimitedIndexOutput.java:78)
>>
>> It seems odd. Please confirm that you don't commit on every update
>> request.
>> The only way to monitor IO throttling is to enable infoStream and
>> read a lot of logs.
>>
>>
>> On Thu, Apr 19, 2018 at 7:59 PM, Denis Demichev 
>> wrote:
>>
>>> Erick,
>>>
>>> Thank you for your quick response.
>>>
>>> I/O bottleneck: Please see another screenshot attached, as you can
>>> see disk r/w operations are pretty low or not significant.
>>> iostat ==
>>> Device:  rrqm/s  wrqm/s  r/s   w/s   rkB/s  wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
>>> xvda     0.00    0.00    0.00  0.00  0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
>>>
>>> avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
>>>           12.52  0.00   0.00     0.00     0.00    87.48
>>>
>>> Device:  rrqm/s  wrqm/s  r/s   w/s   rkB/s  wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util

Using Solr / Lucene with OpenJDK

2018-04-24 Thread Steven White
Hi everyone,

We currently support both Oracle and IBM Java to run Solr and I'm tasked to
switch over to OpenJDK.

Does anyone use Solr, any version, with OpenJDK?  If so, what has been your
experience?  Also, what platforms have you used it on?

I run Solr on Windows, Linux, AIX and Solaris and on each of those
platforms, I need to support both 32 and 64 bit Java.

Thanks,

Steven


RE: IndexFetcher cannot download index file

2018-04-24 Thread Markus Jelsma
Forget about it, recovery got a java.io.IOException: No space left on device 
but it wasn't clear until I inspected the real logs.

The logs in the web admin didn't show the disk space exception, even when I 
expand the log line. Maybe that could be changed.

Thanks,
Markus

 
 
-Original message-
> From:Markus Jelsma 
> Sent: Tuesday 24th April 2018 14:39
> To: Solr-user 
> Subject: IndexFetcher cannot download index file
> 
> Hello,
> 
> Slightly different question/problem: what is going on here on 7.2.1? During 
> the recovery, none of this node's fellow replicas' indexes were changed, but we 
> still got this error. 
> 
> When we got that error, the recovery was restarted, but shortly after the 
> replicas indexes got updated and recovery failed again. I can understand the 
> second failure, but not the first.
> 
> Right now the recovering node seems to be stuck, it cannot recover before the 
> 15 minute indexing cycle is started. Do i really need to stop the indexer and 
> let the node finish recovery? Can we do something about that?
> 
> Also, why is fetching the full index so slow? It should take just a few 
> minutes to transfer 50 GB between those nodes. While recovering, CPU 
> utilization is normal/low.
> 
> Many thanks,
> Markus
> 
> Error fetching file, doing one 
> retry...:org.apache.solr.common.SolrException: Unable to download _l5d.cfs 
> completely. Downloaded 58720256!=218562805
> Error in fetching file: _l5d.cfs (downloaded 58720256 of 218562805 bytes)
> Error deleting file: _l5d.cfs 
> Index fetch failed :org.apache.solr.common.SolrException: Unable to download 
> _l5d.cfs completely. Downloaded 58720256!=218562805 
> Error while trying to recover:org.apache.solr.common.SolrException: 
> Replication for recovery failed.
> Recovery failed - trying again... (0)
> 


IndexFetcher cannot download index file

2018-04-24 Thread Markus Jelsma
Hello,

Slightly different question/problem: what is going on here on 7.2.1? During 
the recovery, none of this node's fellow replicas' indexes were changed, but we 
still got this error. 

When we got that error, the recovery was restarted, but shortly after the 
replicas indexes got updated and recovery failed again. I can understand the 
second failure, but not the first.

Right now the recovering node seems to be stuck, it cannot recover before the 
15 minute indexing cycle is started. Do i really need to stop the indexer and 
let the node finish recovery? Can we do something about that?

Also, why is fetching the full index so slow? It should take just a few minutes 
to transfer 50 GB between those nodes. While recovering, CPU utilization is 
normal/low.

Many thanks,
Markus

Error fetching file, doing one 
retry...:org.apache.solr.common.SolrException: Unable to download _l5d.cfs 
completely. Downloaded 58720256!=218562805
Error in fetching file: _l5d.cfs (downloaded 58720256 of 218562805 bytes)
Error deleting file: _l5d.cfs 
Index fetch failed :org.apache.solr.common.SolrException: Unable to download 
_l5d.cfs completely. Downloaded 58720256!=218562805 
Error while trying to recover:org.apache.solr.common.SolrException: Replication 
for recovery failed.
Recovery failed - trying again... (0)


Re: SolrCloud cluster does not accept new documents for indexing

2018-04-24 Thread Chris Ulicny
I haven't worked with AWS, but recently we tried to move some of our solr
instances to a cloud in Google's Cloud offering, and it did not go well.
All of our problems ended up stemming from the fact that the I/O is
throttled. Any complicated enough query would require too many disk reads
to return the results in a reasonable time when being throttled. SSDs were
better, but not at a practical cost, and not as performant as our own bare metal.

I'm not sure if that is what is happening in your case since it seemed like
your CPU time was mostly idle instead of I/O waits, but your case sounds a
lot like ours when we started indexing in the cloud instances. There might
be an equivalent metric for AWS, but Google had the number of throttled
reads and writes available (albeit through StackDriver) that we could track.

When we were doing the initial indexing, the indexing processes would get
to a point where the updates were taking minutes to complete and the cause
was throttled write ops.

A few things we did to get everything indexing at a reasonable rate for the
initial setup:
-- autoCommit set to something very very low, like 10-15 seconds, and
openSearcher set to false
-- autoSoftCommit set to 1 hour or more (our indexing took days) to avoid
unnecessary read operations during indexing.
-- left the RAM buffer/buffered doc settings and maxIndexingThreads to the
defaults
-- set the max threads and max concurrent merges of the mergeScheduler to
be 1 (or very low). This prevented excessive IO during indexing.
-- Only keep one copy of each shard to avoid duplicate writes/merges on the
follower replicas. Add the redundant copies once after the bulk indexing.
-- There was some setting with respect to the storage objects to make them
faster at the expense of more CPU used (not waiting). Helped with indexing,
but didn't make a difference in the long run.

With regards to SPM. I haven't used it to troubleshoot this type of problem
before, but we use it for all of our solr monitoring. The out-of-the-box
settings work very well for us, so I'm not sure how much metric
customization it allows beyond the initially configured metrics.

Also, most of your attachments got filtered out by the mailing list,
particularly the images.

Best,
Chris

On Mon, Apr 23, 2018 at 5:38 PM Denis Demichev  wrote:

> I conducted another experiment today with local SSD drives, but this did
> not seem to fix my problem.
> Don't see any extensive I/O in this case:
>
>
> Device:  tps    kB_read/s  kB_wrtn/s  kB_read   kB_wrtn
>
> xvda     1.76   88.83      5.52       1256191   77996
>
> xvdb     13.95  111.30     56663.93   1573961   801303364
>
> xvdb - is the device where SolrCloud is installed and data files are kept.
>
> What I see:
> - There are 17 "Lucene Merge Thread #..." running. Some of them are
> blocked, some of them are RUNNING
> - updateExecutor-N-thread-M threads are in parked mode and number of docs
> that I am able to submit is still low
> - Tried to change maxIndexingThreads, set it to something high. This seems
> to prolong the time when cluster is accepting new indexing requests and
> keeps CPU utilization a lot higher while the cluster is merging indexes
>
> Could anyone please point me to the right direction (documentation or Java
> classes) where I can read about how data is passed from updateExecutor
> thread pool to Merge Threads? I assume there should be some internal
> blocking queue or something similar.
> Still cannot wrap my head around how Solr blocks incoming connections. Non-merged
> indexes are not kept in memory, so I don't clearly understand why
> Solr cannot keep writing index files to HDD while other threads are merging
> indexes (since this is a continuous process anyway).
>
> Does anyone use SPM monitoring tool for that type of problems? Is it of
> any use at all?
>
>
> Thank you in advance.
>
> [image: image.png]
>
>
> Regards,
> Denis
>
>
> On Fri, Apr 20, 2018 at 1:28 PM Denis Demichev  wrote:
>
>> Mikhail,
>>
>> Sure, I will keep everyone posted. Moving to non-HVM instance may take
>> some time, so hopefully I will be able to share my observations in the next
>> couple of days or so.
>> Thanks again for all the help.
>>
>> Regards,
>> Denis
>>
>>
>> On Fri, Apr 20, 2018 at 6:02 AM Mikhail Khludnev  wrote:
>>
>>> Denis, please let me know what it ends up with. I'm really curious
>>> regarding this case and AWS instance flavours. FWIW, since 7.4 we'll have
>>> ioThrottle=false option.
>>>
>>> On Thu, Apr 19, 2018 at 11:06 PM, Denis Demichev 
>>> wrote:
>>>
 Mikhail, Erick,

 Thank you.

 What just occurred to me - we don't use local SSD but instead we're
 using EBS volumes.
 This was a wrong instance type that I looked at.
 Will try to set up a cluster with SSD nodes and retest.

 Regards,
 Denis


 On Thu, Apr 19, 2018 at 2:56 PM Mikhail 

IndexFetcher checksums don't match

2018-04-24 Thread Markus Jelsma
Hello,

After a failed log replay (it got a ClassCastException) with 7.2.1 it seems 
Solr tries to haul over a 50 GB index from another replica. While doing so, it 
throws a good number of checksum warnings.

Why don't the checksums match? Can I safely ignore them? Do I need to do 
something about it?

Many thanks,
Markus

File _kg3.nvm did not match. expected checksum is 54176987 and actual is 
checksum 1724417924. expected length is 1288 and actual length is 1288
4/24/2018, 2:13:53 PM   
File _kg3.fnm did not match. expected checksum is 3553991974 and actual is 
checksum 2803425346. expected length is 20109 and actual length is 20109
4/24/2018, 2:13:53 PM
File _kg3.tvd did not match. expected checksum is 199310828 and actual is 
checksum 1010028704. expected length is 1824094293 and actual length is 
1824094293


ClassCastException: o.a.l.d.Field cannot be cast to o.a.l.d.StoredField

2018-04-24 Thread Markus Jelsma
Hello,

We have a DocumentTransformer that gets a Field from the SolrDocument and casts 
it to StoredField (although apparently we don't need to cast). This works well 
in tests and fine in production, except in some curious, unknown, and 
unreproducible cases, which throw the ClassCastException.

I can, and will, just remove the cast to fix the rare exception, but in what 
cases could the exception get thrown?
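For what it's worth, a minimal sketch of the cast-free read (field name hypothetical): going through the IndexableField interface, which Field and StoredField both implement, avoids depending on the concrete class of the stored value.

```java
import org.apache.lucene.index.IndexableField;
import org.apache.solr.common.SolrDocument;

public class SafeRead {
  // Sketch: read a stored string value without downcasting to StoredField.
  static String readStoredString(SolrDocument doc, String fieldName) {
    Object value = doc.getFieldValue(fieldName);
    if (value instanceof IndexableField) {
      return ((IndexableField) value).stringValue(); // null for binary/numeric values
    }
    return value == null ? null : value.toString();  // may already be a plain String
  }
}
```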

Many thanks,
Markus


Re: Learning to Rank (LTR) with grouping

2018-04-24 Thread Alessandro Benedetti
Are you using SolrCloud or any distributed search ?

If you are using just a single Solr instance, LTR should have no problem
with pagination.
The re-rank involves the top K and then you paginate.
So if a document from the original score page 1 ends up in page 3, you will
see it at page three.
have you verified that : "Say, if an item (Y) from second page is moved to
first page after 
re-ranking, while an item (X) from first page is moved away from the first 
page.  ?" 
Top K shouldn't start from the "start" parameter, if it does, it is a bug.

The situation changes a little with distributed search, where you can
experience this behaviour:

*Pagination*
Let’s explore the scenario on a single Solr node and on a sharded
architecture.

SINGLE SOLR NODE

reRankDocs=15
rows=10
This means each page is composed of 10 results.
What happens when we hit page 2?
The first 5 documents in the search results will have been rescored and
affected by the reranking.
The latter 5 documents will preserve the original score and original
ranking.

e.g.
Doc 11 – score= 1.2
Doc 12 – score= 1.1
Doc 13 – score= 1.0
Doc 14 – score= 0.9
Doc 15 – score= 0.8
Doc 16 – score= 5.7
Doc 17 – score= 5.6
Doc 18 – score= 5.5
Doc 19 – score= 4.6
Doc 20 – score= 2.4
This means that score(15) could be < score(16), but documents 15 and 16 are
still in the expected order.
The reason is that the top 15 documents are rescored and reranked and the
rest is left unchanged.

*SHARDED ARCHITECTURE*

reRankDocs=15
rows=10
Shards number=2
When looking for page 2, Solr will trigger queries to the shards to
collect 2 pages per shard:
Shard1 : 10 ReRanked docs (page1) + 5 ReRanked docs + 5 OriginalScored docs
(page2)
Shard2 : 10 ReRanked docs (page1) + 5 ReRanked docs + 5 OriginalScored docs
(page2)

Then the results will be merged, and possibly original-scored search results
can end up above re-ranked docs.
A possible solution could be to normalise the scores to prevent any
possibility that a reranked result is surpassed by original scored ones.

Note: The problem is going to happen after you reach rows * page >
reRankDocs. In situations when reRankDocs is quite high, the problem will
occur only in deep paging.
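For concreteness, a page-2 request matching the single-node numbers above might look like this (the collection and model names are hypothetical):

```
# Page 2 of a re-ranked query: reRankDocs=15, rows=10, start=10
# -> positions 11-15 carry re-ranked scores, 16-20 keep original scores
http://localhost:8983/solr/mycollection/select?q=test&rq={!ltr model=myModel reRankDocs=15}&rows=10&start=10&fl=id,score
```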



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io


RE: SolrCloud DIH (Data Import Handler) MySQL 404

2018-04-24 Thread msaunier
I have modified the DIH definition to simplify it, but I get the same errors:

## indexation_events.xml











##

Maxence,





-Original message-
From: msaunier [mailto:msaun...@citya.com]
Sent: Tuesday 24 April 2018 10:04
To: solr-user@lucene.apache.org
Subject: RE: SolrCloud DIH (Data Import Handler) MySQL 404

If I access the interface, I get a null pointer exception:

null:java.lang.NullPointerException
at 
org.apache.solr.handler.RequestHandlerBase.getVersion(RequestHandlerBase.java:233)
at 
org.apache.solr.handler.admin.SolrInfoMBeanHandler.addMBean(SolrInfoMBeanHandler.java:187)
at 
org.apache.solr.handler.admin.SolrInfoMBeanHandler.getMBeanInfo(SolrInfoMBeanHandler.java:163)
at 
org.apache.solr.handler.admin.SolrInfoMBeanHandler.handleRequestBody(SolrInfoMBeanHandler.java:80)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:534)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at 
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Thread.java:748)





-Original message-
From: msaunier [mailto:msaun...@citya.com]
Sent: Tuesday 24 April 2018 09:25
To: solr-user@lucene.apache.org
Subject: RE: SolrCloud DIH (Data Import Handler) MySQL 404

Hello Shawn,
Thanks for your answers. 

#
So, indexation_events.xml file is:














































#
And the config file is the configoverlay.xml, it's in cloud:

{
  "updateProcessor":{},

  "runtimeLib":{

RE: SolrCloud DIH (Data Import Handler) MySQL 404

2018-04-24 Thread msaunier
If I access the interface, I get a null pointer exception:

null:java.lang.NullPointerException
at 
org.apache.solr.handler.RequestHandlerBase.getVersion(RequestHandlerBase.java:233)
at 
org.apache.solr.handler.admin.SolrInfoMBeanHandler.addMBean(SolrInfoMBeanHandler.java:187)
at 
org.apache.solr.handler.admin.SolrInfoMBeanHandler.getMBeanInfo(SolrInfoMBeanHandler.java:163)
at 
org.apache.solr.handler.admin.SolrInfoMBeanHandler.handleRequestBody(SolrInfoMBeanHandler.java:80)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:534)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at 
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Thread.java:748)





-Original message-
From: msaunier [mailto:msaun...@citya.com]
Sent: Tuesday 24 April 2018 09:25
To: solr-user@lucene.apache.org
Subject: RE: SolrCloud DIH (Data Import Handler) MySQL 404

Hello Shawn,
Thanks for your answers. 

#
So, indexation_events.xml file is:














































#
And the config file is the configoverlay.xml, it's in cloud:

{
  "updateProcessor":{},

  "runtimeLib":{
"mysql-connector-java":{
  "name":"mysql-connector-java",
  "version":1},

"data-import-handler":{
  "name":"data-import-handler",
  "version":1}},

  "requestHandler":{"/test_dih":{
  "name":"/test_dih",
  "class":"org.apache.solr.handler.dataimport.DataImportHandler",
  "runtimeLib":true,
  "version":1,
  "defaults":{"config":"DIH/indexation_events.xml"}}}
}

I am going to look at the solr.log.

RE: SolrCloud DIH (Data Import Handler) MySQL 404

2018-04-24 Thread msaunier
Hello Shawn,
Thanks for your answers. 

#
So, indexation_events.xml file is:














































#
And the config file is the configoverlay.xml, it's in cloud:

{
  "updateProcessor":{},

  "runtimeLib":{
"mysql-connector-java":{
  "name":"mysql-connector-java",
  "version":1},

"data-import-handler":{
  "name":"data-import-handler",
  "version":1}},

  "requestHandler":{"/test_dih":{
  "name":"/test_dih",
  "class":"org.apache.solr.handler.dataimport.DataImportHandler",
  "runtimeLib":true,
  "version":1,
  "defaults":{"config":"DIH/indexation_events.xml"}}}
}

I am going to look at the solr.log.

Thanks,
Maxence





-Original message-
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: Monday 23 April 2018 18:28
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud DIH (Data Import Handler) MySQL 404

On 4/23/2018 8:30 AM, msaunier wrote:
> I have added debug:
>
> curl "http://srv-formation-solr:8983/solr/arguments_test/test_dih?command=full-import=true=true"
>name="responseHeader">500 name="QTime">588 name="runtimeLib">true1 name="defaults"> name="config">DIH/indexation_events.xml

Re: versions of documentation: suggestion for improvement

2018-04-24 Thread Arturas Mazeika
Hi Hoss et al,

Thanks for the prompt answer and the links. I see there are quite a few
interesting discussions around the issue already. Let me take some time to
get into the details.

I also noticed that there's the concept of "latest" (similar to "current"
in postgres documentation) in solr. This is pretty cool. I am afraid
though, that this currently is somewhat confusing. E.g., if I search for
managed schema in google I get this as 1st url:

https://lucene.apache.org/solr/guide/6_6/schema-factory-definition-in-solrconfig.html

now if I try to replace 6_6 with latest:

https://lucene.apache.org/solr/guide/latest/schema-factory-definition-in-solrconfig.html

I am redirected to

https://lucene.apache.org/solr/guide/7_3/latest/schema-factory-definition-in-solrconfig.html

(with "latest" in the url, which, of course, does not exist).

But let me meditate on this a bit.

@Erik: good point w.r.t PDF :-)

Cheers,
Arturas


On Tue, Apr 24, 2018 at 1:49 AM, Chris Hostetter 
wrote:

>
>
> There's been some discussion along the lines of doing some things like
> what you propose which were spun out of discussion in SOLR-10595 into the
> issue LUCENE-7924 ... but so far no one has attempted the
> tooling/scripting work needed to make it happen.
>
> Patches certainly welcome.
>
>
>
> : Date: Mon, 23 Apr 2018 09:55:35 +0200
> : From: Arturas Mazeika 
> : Reply-To: solr-user@lucene.apache.org
> : To: solr-user@lucene.apache.org
> : Subject: versions of documentation: suggestion for improvement
> :
> : Hi Solr-Team,
> :
> : If I google for specific features for solr, I usually get redirected to
> : the 6.6 version of the documentation, like this one:
> :
> : https://lucene.apache.org/solr/guide/6_6/overview-of-documents-fields-and-schema-design.html
> :
> : Since I am playing with the 7.2 version of solr, I almost always need to
> : change this manually to:
> :
> : https://lucene.apache.org/solr/guide/7_2/overview-of-documents-fields-and-schema-design.html
> :
> : (by clicking on the url, going to the number, and replacing two
> : characters). This is somewhat cumbersome (especially after the first
> : dozen of changes in urls). Suggestion:
> :
> : (1) Would it make sense to include other versions of the document as urls
> : on the page? See, e.g., the following documentation of postgres, where
> : each page has a pointer to the same page in different versions:
> :
> : https://www.postgresql.org/docs/9.6/static/sql-createtable.html
> :
> : (especially the "This page in other versions: 9.3 / 9.4 / 9.5 / *9.6* /
> : current (10)" line on the page)
> :
> : (2) Would it make sense in addition to include "current", pointing to the
> : latest current release?
> :
> : This would help to find solr relevant infos from search engines faster.
> :
> : Cheers,
> : Arturas
> :
>
> -Hoss
> http://www.lucidworks.com/
>