date:20190306

Re: DateRangeField, but with Integers instead of dates?

2019-03-06 Thread Mikhail Khludnev

I think https://wiki.apache.org/solr/SpatialForTimeDurations provides
good guidance.

On Thu, Mar 7, 2019 at 8:53 AM Zheng Lin Edwin Yeo 
wrote:

> Hi,
>
> Do you mean you plan to index a range of numbers into a single field?
>
> Regards,
> Edwin
>
> On Wed, 6 Mar 2019 at 17:53, Ryan Yacyshyn 
> wrote:
>
> > Hi all,
> >
> > Is there a way I can perform a filter query on a field that contains a
> > numeric range? Very much like how we can query over a date range:
> >
> >
> >
> https://lucene.apache.org/solr/guide/7_3/working-with-dates.html#more-daterangefield-details
> >
> > I'd like to index documents that have a field containing a range of
> > integers, such as:
> >
> > doc 1 --> { my_field: [200 TO 250] }
> > doc 2 --> { my_field: [240 TO 270] }
> >
> > And then have a query such as fq={!field f=my_field op=Contains}[210 TO
> > 230] to return the first doc. This is possible with Dates, can I do
> > something like this with integers?
> >
> > Couldn't find much online on this, if anyone can point me in the right
> > direction that would be great!
> >
> > Ryan
> >
>


-- 
Sincerely yours
Mikhail Khludnev
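
For readers who only need the Contains semantics from the question, one possible workaround (not the spatial approach the wiki page describes) is to keep the two bounds in separate integer fields. Below is a minimal Java sketch under that assumption. The field names my_field_min and my_field_max are hypothetical, and the idea only holds when each document carries a single range.

```java
// Sketch only: models "stored range contains query range" and builds the
// matching Solr filter query. The two field names are hypothetical.
final class IntRangeFilterSketch {

    /** True when [docMin, docMax] fully contains [qMin, qMax]. */
    static boolean contains(int docMin, int docMax, int qMin, int qMax) {
        return docMin <= qMin && qMax <= docMax;
    }

    /** Standard Solr range syntax over two single-valued integer fields. */
    static String containsFilter(String minField, String maxField, int qMin, int qMax) {
        return minField + ":[* TO " + qMin + "] AND " + maxField + ":[" + qMax + " TO *]";
    }

    public static void main(String[] args) {
        System.out.println(contains(200, 250, 210, 230)); // doc 1 from the question
        System.out.println(contains(240, 270, 210, 230)); // doc 2 from the question
        System.out.println("fq=" + containsFilter("my_field_min", "my_field_max", 210, 230));
    }
}
```

With the two documents from the question, only doc 1 satisfies the predicate, which is the result the original poster asked for.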

CVE-2019-0192 Deserialization of untrusted data via jmx.serviceUrl in Apache Solr

2019-03-06 Thread Tomas Fernandez Lobbe

Severity: High

Vendor: The Apache Software Foundation

Versions Affected:
5.0.0 to 5.5.5
6.0.0 to 6.6.5

Description:
The Config API allows configuring Solr's JMX server via an HTTP POST request.
By pointing it to a malicious RMI server, an attacker could take advantage
of Solr's unsafe deserialization to trigger remote code execution on the
Solr side.

Mitigation:
Any of the following is enough to prevent this vulnerability:
* Upgrade to Apache Solr 7.0 or later.
* Disable the Config API if not in use, by running Solr with the system
property "disable.configEdit=true".
* If upgrading or disabling the Config API are not viable options, apply
the patch in [1] and re-compile Solr.
* Ensure your network settings are configured so that only trusted traffic
is allowed to ingress/egress your hosts running Solr.

Credit:
Michael Stepankin

References:
[1] https://issues.apache.org/jira/browse/SOLR-13301
[2] https://wiki.apache.org/solr/SolrSecurity

Re: DateRangeField, but with Integers instead of dates?

2019-03-06 Thread Zheng Lin Edwin Yeo

Hi,

Do you mean you plan to index a range of numbers into a single field?

Regards,
Edwin

On Wed, 6 Mar 2019 at 17:53, Ryan Yacyshyn  wrote:

> Hi all,
>
> Is there a way I can perform a filter query on a field that contains a
> numeric range? Very much like how we can query over a date range:
>
>
> https://lucene.apache.org/solr/guide/7_3/working-with-dates.html#more-daterangefield-details
>
> I'd like to index documents that have a field containing a range of
> integers, such as:
>
> doc 1 --> { my_field: [200 TO 250] }
> doc 2 --> { my_field: [240 TO 270] }
>
> And then have a query such as fq={!field f=my_field op=Contains}[210 TO
> 230] to return the first doc. This is possible with Dates, can I do
> something like this with integers?
>
> Couldn't find much online on this, if anyone can point me in the right
> direction that would be great!
>
> Ryan
>

Re: Error when trying to create a collection with SolrCloud and Zookeeper Ensemble

2019-03-06 Thread Zheng Lin Edwin Yeo

Hi,

What is your setup like?
Which version of Solr and ZooKeeper are you using? And are you enabling
Basic Authentication and SSL in your Solr?

Regards,
Edwin

On Wed, 6 Mar 2019 at 13:37, maimuna ambareen 
wrote:

> I received the error below when I tried to create a collection with
> SolrCloud and a ZooKeeper ensemble.
> Can someone help me to resolve the error?
>
> {
>   "responseHeader":{
> "status":500,
> "QTime":2641},
>   "Operation create caused
>
> exception:":"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> Error getting replica locations : unable to get autoscaling policy
> session",
>   "exception":{
> "msg":"Error getting replica locations : unable to get autoscaling
> policy session",
> "rspCode":500},
>   "error":{
> "metadata":[
>   "error-class","org.apache.solr.common.SolrException",
>   "root-error-class","org.apache.solr.common.SolrException"],
> "msg":"Error getting replica locations : unable to get autoscaling
> policy session",
> "trace":"org.apache.solr.common.SolrException: Error getting replica
> locations : unable to get autoscaling policy session\n\tat
>
> org.apache.solr.client.solrj.SolrResponse.getException(SolrResponse.java:53)\n\tat
>
> org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:275)\n\tat
>
> org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:247)\n\tat
>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)\n\tat
>
> org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:735)\n\tat
>
> org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:716)\n\tat
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:496)\n\tat
>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:395)\n\tat
>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:341)\n\tat
>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)\n\tat
>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)\n\tat
>
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)\n\tat
>
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1588)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)\n\tat
>
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)\n\tat
>
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)\n\tat
>
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1557)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)\n\tat
>
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)\n\tat
>
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)\n\tat
>
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)\n\tat
>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
>
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)\n\tat
>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
> org.eclipse.jetty.server.Server.handle(Server.java:502)\n\tat
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)\n\tat
>
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)\n\tat
> org.eclipse.jetty.io
> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)\n\tat
> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)\n\tat
> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)\n\tat
>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)\n\tat
>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)\n\tat
>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)\n\tat
>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)\n\tat
>
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)\n\tat
>
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)\n\tat
>
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)\n\tat
> java.lang.Thread.run(Thr

TrieDate field in UpdateRequestProcessorChain

2019-03-06 Thread Anil

Hi Team,

I am using Solr 6.6.2 and my schema includes a date field 'window_time' of
type TrieDate. window_time is added to the doc id of the Solr document using
CloneFieldUpdateProcessorFactory and TruncateFieldUpdateProcessorFactory.

I noticed different date formats in window_time and the doc id:

"window_time" : *"2019-01-03T12:00:00Z"*
and
"id": 123445-products-*Thu Jan 03 12:00:00 UTC 2019*


I have checked the CloneFieldUpdateProcessorFactory and
TruncateFieldUpdateProcessorFactory source code and didn't find much
customization there.

Is there any way to keep the date format of window_time and the format of
its value in the ID the same?

Thanks in advance.

Regards,
Anil
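
The id value above has the shape that java.util.Date.toString() produces, while window_time is shown in ISO-8601 form. Below is a minimal Java sketch of the two renderings. It assumes (this is not confirmed in the thread) that the Date object is stringified when it is copied into the id. The class and method names are hypothetical.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Sketch only: contrasts the ISO-8601 rendering with a Date.toString()-style
// rendering of the same instant. Names are hypothetical.
final class DateFormatSketch {

    /** ISO-8601 instant form, e.g. 2019-01-03T12:00:00Z. */
    static String isoForm(Date date) {
        return date.toInstant().toString();
    }

    /** Same layout as Date.toString(), pinned to UTC so the output is stable. */
    static String legacyForm(Date date) {
        SimpleDateFormat format =
                new SimpleDateFormat("EEE MMM dd HH:mm:ss zzz yyyy", Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(date);
    }

    public static void main(String[] args) {
        Date windowTime = Date.from(Instant.parse("2019-01-03T12:00:00Z"));
        System.out.println("window_time: " + isoForm(windowTime));
        System.out.println("id (legacy): 123445-products-" + legacyForm(windowTime));
        System.out.println("id (ISO):    123445-products-" + isoForm(windowTime));
    }
}
```

If that assumption holds, building the id from the ISO string (on the client, or in a custom update processor) would keep both values in the same format.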

Re: Hide BasicAuth JVM param on SOLR admin UI

2019-03-06 Thread Aroop Ganguly

Try changing the passwords using the auth API:
https://lucene.apache.org/solr/guide/6_6/basic-authentication-plugin.html#BasicAuthenticationPlugin-AddaUserorEditaPassword

From that point onwards your credentials will be encrypted on the admin UI.
I do not think your -Dbasicauth password will change, but your actual password
would be different, hashed and Base64-encoded.


> On Mar 6, 2019, at 12:22 AM, el mas capo  wrote:
> 
> Hi everyone,
> I am trying to configure SolrCloud (7.7.0) with Basic Authentication. All
> seems to work nicely, but when I open the Web UI I can see the Basic Auth
> password configured in solr.in.sh in clear text:
> -Dbasicauth=solr:SolrRocks
> Can this behaviour be avoided?
> Thank you for your attention.
>
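
On the encoding mentioned above: Base64 is a reversible encoding rather than encryption, so a Base64 form of user:password protects nothing by itself. Below is a minimal Java sketch using the credentials from the question. The class and method names are hypothetical, and it shows only the HTTP Basic header value, not how Solr stores credentials.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch only: shows that the HTTP Basic "user:password" token is plain
// Base64 and can be decoded by anyone who sees it. Names are hypothetical.
final class BasicAuthEncodingSketch {

    /** Base64 of "user:password", as used in an Authorization: Basic header. */
    static String encode(String user, String password) {
        byte[] raw = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return Base64.getEncoder().encodeToString(raw);
    }

    /** Reverses encode(); no key or secret is needed. */
    static String decode(String token) {
        return new String(Base64.getDecoder().decode(token), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String token = encode("solr", "SolrRocks");
        System.out.println("Authorization: Basic " + token);
        System.out.println("decoded: " + decode(token));
    }
}
```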

1969 vs 1960s: not-quite-synonyms in Solr

2019-03-06 Thread Gregg Donovan

For a search like "1969 shirt" I would like to return items with either
1969 or 1960s but boost 1969 items higher. For the query "1960s shirt",
1960s and 1960, 1961, ... 1969 should all match equally.

Is there a standard technique for this? I'm struggling to do this with
eDisMax without adding new fields to the index.

Thanks.

Gregg
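
One way to approach this without new index fields is to rewrite the year or decade term on the client before the query reaches eDisMax. The Java sketch below is only an illustration of that idea, not an established Solr technique for this problem. The class name, the method and the boost value are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: client-side expansion of a single query term.
// "1969"  -> exact year boosted, decade term as a weaker alternative.
// "1960s" -> decade term plus every year in it, all weighted equally.
final class DecadeExpansionSketch {
    private static final Pattern YEAR = Pattern.compile("\\d{4}");
    private static final Pattern DECADE = Pattern.compile("(\\d{3})0s");

    static String expand(String term, int exactYearBoost) {
        Matcher decade = DECADE.matcher(term);
        if (decade.matches()) {
            List<String> alternatives = new ArrayList<>();
            alternatives.add(term);
            for (int digit = 0; digit <= 9; digit++) {
                alternatives.add(decade.group(1) + digit);
            }
            return "(" + String.join(" OR ", alternatives) + ")";
        }
        if (YEAR.matcher(term).matches()) {
            String decadeTerm = term.substring(0, 3) + "0s";
            return "(" + term + "^" + exactYearBoost + " OR " + decadeTerm + ")";
        }
        return term; // anything else passes through unchanged
    }

    public static void main(String[] args) {
        System.out.println(expand("1969", 2) + " shirt");
        System.out.println(expand("1960s", 2) + " shirt");
    }
}
```

The expanded string uses only the ^ boost and OR operators of the standard query syntax. Whether the resulting scores behave as wanted would still need checking against the real index.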

Re: RegexReplaceProcessorFactory pattern to detect multiple \n

2019-03-06 Thread Zheng Lin Edwin Yeo

Hi Paul,

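I have tried with the first match pattern to be
[ \t\x0b\f]*\r?\n, like the configuration below:

<processor class="solr.RegexReplaceProcessorFactory">
   <str name="fieldName">content</str>
   <str name="pattern">[ \t\x0b\f]*\r?\n</str>
   <str name="replacement">&lt;br&gt;</str>
   <bool name="literalReplacement">true</bool>
</processor>
<processor class="solr.RegexReplaceProcessorFactory">
   <str name="fieldName">content</str>
   <str name="pattern">(&lt;br&gt;){3,}</str>
   <str name="replacement">&lt;br&gt;&lt;br&gt;</str>
   <bool name="literalReplacement">true</bool>
</processor>

However, the result is still the same as before (previous index results),
with the 4 <br>.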

Regards,
Edwin
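
For comparison outside Solr, the two-step replacement discussed in this thread can be tried with plain java.util.regex. Below is a minimal sketch using the "Dear Sir" sample quoted further down. The class and method names are hypothetical, and Matcher.quoteReplacement stands in for literalReplacement=true.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: step 1 turns each line ending (plus the whitespace before it)
// into <br>; step 2 collapses runs of three or more <br> down to two.
final class NewlineCollapseSketch {
    private static final Pattern LINE_END =
            Pattern.compile("[ \\t\\x0b\\f]*\\r?\\n");
    private static final Pattern THREE_OR_MORE_BR =
            Pattern.compile("(<br>){3,}");

    static String collapse(String content) {
        String withBreaks = LINE_END.matcher(content)
                .replaceAll(Matcher.quoteReplacement("<br>"));
        return THREE_OR_MORE_BR.matcher(withBreaks)
                .replaceAll(Matcher.quoteReplacement("<br><br>"));
    }

    public static void main(String[] args) {
        String sample = "Dear Sir,  \n\n \n \n\n I am terminating";
        System.out.println(collapse(sample));
    }
}
```

If Solr still leaves extra <br> where this sketch does not, one thing worth checking (an assumption, not something confirmed in the thread) is whether the real content contains separator characters outside the class, such as a non-breaking space (\u00a0).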


On Wed, 6 Mar 2019 at 18:23,  wrote:

> Hi Edwin
>
>
>
> You are correct re the 2nd pattern – my bad. Looking at the 4 <br>, it’s
> actually the sequence «  »? So perhaps the first match
> pattern could be [ \t\x0b\f]*\r?\n
>
>
>
> i.e. [space tab vertical-tab formfeed]
>
>
>
> Regards,
>
> Paul
>
>
>
> Sent from Mail for Windows 10
>
>
>
> From: Zheng Lin Edwin Yeo
> Sent: Wednesday, 6 March 2019 07:44
> To: solr-user@lucene.apache.org
> Subject: Re: RegexReplaceProcessorFactory pattern to detect multiple \n
>
>
>
> Hi Paul,
>
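> I have modified the second pattern to be (<br>){3,}, instead of
> (<br><br>){3,}. This pattern of (<br><br>){3,}
> will actually look for 6 or more <br> instead of 3 <br>, as we have put
> the <br> two times in the pattern, which is the reason that there are more
> <br> in the result, as cases where there are less than 6 <br> are not being
> replaced, so we ended up having up to 5 <br> in the index.
>
> Modified configuration:
> <processor class="solr.RegexReplaceProcessorFactory">
>    <str name="fieldName">content</str>
>    <str name="pattern">(&lt;br&gt;){3,}</str>
>    <str name="replacement">&lt;br&gt;&lt;br&gt;</str>
>    <bool name="literalReplacement">true</bool>
> </processor>
>
> This will bring us back to the result of the previous index content,
> meaning the issue of having the 4 <br> is still there.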
>
> Regards,
> Edwin
>
>
>
> Regards,
> Edwin
>
> On Wed, 6 Mar 2019 at 11:37, Zheng Lin Edwin Yeo 
> wrote:
>
> > Hi Paul,
> >
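> > Further to my previous email, in which there was an extra "}" in the
> > configuration, I have changed to use the below configuration based on
> > your suggestion.
> >
> > <processor class="solr.RegexReplaceProcessorFactory">
> >    <str name="fieldName">content</str>
> >    <str name="pattern">[ \t]*\r?\n</str>
> >    <str name="replacement">&lt;br&gt;</str>
> >    <bool name="literalReplacement">true</bool>
> > </processor>
> > <processor class="solr.RegexReplaceProcessorFactory">
> >    <str name="fieldName">content</str>
> >    <str name="pattern">(&lt;br&gt;&lt;br&gt;){3,}</str>
> >    <str name="replacement">&lt;br&gt;&lt;br&gt;</str>
> >    <bool name="literalReplacement">true</bool>
> > </processor>
> >
> > However, the result that I get still has more than 2 <br>. In fact, the
> > result became worse, as you can see from the comparison below.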
> >
> > Example 1: A sentence for which the regex pattern used to work correctly.
> > But with the latest pattern, it has now changed from 2 <br> to 5 <br>,
> > which is wrong.
> > *Original content in EML file:*
> > Dear Sir,
> >
> >
> > I am terminating
> > *Original content:*Dear Sir,  \n\n \n \n\n I am terminating
> > *Previous Index content: *Dear Sir,  I am terminating
> > *Current Index content*:   Dear Sir,  I am
> terminating
> >
> > Example 2: A sentence for which the above regex pattern is partially working
> > (as you can see, instead of 2 <br>, there are 4 <br>)
> > *Original content in EML file:*
> >
> > *exalted*
> >
> > *Psalm 89:17*
> >
> >
> > 3 Choa Chu Kang Avenue 4
> > *Original content:* exalted  \n \n\n   Psalm 89:17   \n\n   \n\n  3 Choa
> > Chu Kang Avenue 4, Singapore
> > *Previous Index content: *exalted  Psalm 89:17   
> > 3 Choa Chu Kang Avenue 4, Singapore
> > *Current Index content*:Psalm 89:173
> > Choa Chu Kang Avenue 3, Singapor4
> >
> > Example 3: A sentence for which the above regex pattern is partially working
> > (as you can see, instead of 2 <br>, there are 4 <br>). For the latest
> > code, there are now 5 <br>
> > *Original content in EML file:*
> >
> > http://www.concorded.com/
> >
> >
> >
> >
> >
> >
> >
> >
> > On Tue, Dec 18, 2018 at 10:07 AM
> > *Original content:* http://www.concorded.com/   \n\n   \n\n \n \n\n \n\n
> > \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n\n \n\n\n  On Tue, Dec 18, 2018 at
> > 10:07 AM
> > *Previous Index content: *http://www.concorded.com/   
> > On Tue, Dec 18, 2018 at 10:07 AM
> > *Current Index content:* http://www.concorded.com/  
> > On Tue, Dec 18, 2018 at 10:07 AM
> >
> >
> > Regards,
> > Edwin
> >
> > On Wed, 6 Mar 2019 at 00:29, Zheng Lin Edwin Yeo 
> > wrote:
> >
> >> Hi Paul,
> >>
> >> Thank you for the reply.
> >>
> >> I have tried to add the following configuration according to your
> >> suggestion:
> >>
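> >> <processor class="solr.RegexReplaceProcessorFactory">
> >>    <str name="fieldName">content</str>
> >>    <str name="pattern">[ \t]*\r?\n}</str>
> >>    <str name="replacement">&lt;br&gt;</str>
> >>    <bool name="literalReplacement">true</bool>
> >> </processor>
> >>
> >> <processor class="solr.RegexReplaceProcessorFactory">
> >>    <str name="fieldName">content</str>
> >>    <str name="pattern">(&lt;br&gt;&lt;br&gt;){3,}</str>
> >>    <str name="replacement">&lt;br&gt;&lt;br&gt;</str>
> >>    <bool name="literalReplacement">true</bool>
> >> </processor>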
> >>
> >> However, none of the \n is being removed this time round.
> >> Is the order and/or the pattern correct?
> >>
> >> Regards,
> >> Edwin
> >>
> >> On Tue, 5 Mar 2019 at 19:54,  wrote:
> >>
> >>> Hi Edwin
> >>>
> >>>
> >>>
> >>> Try for the first pattern/replacement
> >>>
> >>> [ \t]*\r?\n
> >>>
> >>> <br>
> >>>
> >>> Now all line endings and preceding whitespace characters should be
> >>> changed to ‘<br>’.
> >>>
> >>> The second pattern replacement should replace 3 or more ‘<br>’
> >>> sequences with 2 ‘<br>’ sequences:
> >>>
> >>> (<br><br>){3,}
> >>>
> >>> <br><br>
> >>>
> >>>
> >>> Hope this approach works. Sorry for not replying earlier and best
> >>> regards,
> >>>
> >>> Paul
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> Gesendet vo

4 Apache Events in 2019: DC Roadshow soon; next up Chicago, Las Vegas, and Berlin!

2019-03-06 Thread Rich Bowen

Dear Apache Enthusiast,

(You’re receiving this because you are subscribed to one or more user
mailing lists for an Apache Software Foundation project.)

TL;DR:
 * Apache Roadshow DC is in 3 weeks. Register now at
https://apachecon.com/usroadshowdc19/
 * Registration for Apache Roadshow Chicago is open.
http://apachecon.com/chiroadshow19
 * The CFP for ApacheCon North America is now open.
https://apachecon.com/acna19
 * Save the date: ApacheCon Europe will be held in Berlin, October 22nd
through 24th.  https://apachecon.com/aceu19


Registration is open for two Apache Roadshows; these are smaller events
with a more focused program and regional community engagement:

Our Roadshow event in Washington DC takes place in under three weeks, on
March 25th. We’ll be hosting a day-long event at the Fairfax campus of
George Mason University. The roadshow is a full day of technical talks
(two tracks) and an open source job fair featuring AWS, Bloomberg, dito,
GridGain, Linode, and Security University. More details about the
program, the job fair, and to register, visit
https://apachecon.com/usroadshowdc19/

Apache Roadshow Chicago will be held May 13-14th at a number of venues
in Chicago’s Logan Square neighborhood. This event will feature sessions
in AdTech, FinTech and Insurance, startups, “Made in Chicago”, Project
Shark Tank (innovations from the Apache Incubator), community diversity,
and more. It’s a great way to learn about various Apache projects “at
work” while playing at a brewery, a beercade, and a neighborhood bar.
Sign up today at https://www.apachecon.com/chiroadshow19/

We’re delighted to announce that the Call for Presentations (CFP) is now
open for ApacheCon North America in Las Vegas, September 9-13th! As the
official conference series of the ASF, ApacheCon North America will
feature over a dozen Apache project summits, including Cassandra,
Cloudstack, Tomcat, Traffic Control, and more. We’re looking for talks
in a wide variety of categories -- anything related to ASF projects and
the Apache development process. The CFP closes at midnight on May 26th.
In addition, the ASF will be celebrating its 20th Anniversary during the
event. For more details and to submit a proposal for the CFP, visit
https://apachecon.com/acna19/ . Registration will be opening soon.

Be sure to mark your calendars for ApacheCon Europe, which will be held
in Berlin, October 22-24th at the KulturBrauerei, a landmark of Berlin's
industrial history. In addition to innovative content from our projects,
we are collaborating with the Open Source Design community
(https://opensourcedesign.net/) to offer a track on design this year.
The CFP and registration will open soon at https://apachecon.com/aceu19/ .

Sponsorship opportunities are available for all events, with details
listed on each event’s site at http://apachecon.com/.

We look forward to seeing you!

Rich, for the ApacheCon Planners
@apachecon

java.lang.IllegalArgumentException: docID must be >= 0 and < maxDoc=2757277 (got docID=2764367)

2019-03-06 Thread Zubovich Yauheni

Hi,

We are running Solr 7.3 in cloud mode. Recently I noticed that we have a lot of
errors in our log like the one below:

null:java.lang.IllegalArgumentException: docID must be >= 0 and <
maxDoc=2757277 (got docID=2764367)
at 
org.apache.lucene.index.BaseCompositeReader.readerIndex(BaseCompositeReader.java:190)
at 
org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:117)
at 
org.apache.solr.search.SolrDocumentFetcher.doc(SolrDocumentFetcher.java:220)
at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:98)
at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:57)
at c.d.m.i.ProfileResponseWriter.toSolrDocumentList(SolrConverter.java:25)
at c.d.m.i.ProfileResponseWriter.write(ProfileResponseWriter.java:197)
at 
org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:65)
at org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:789)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:526)
...


ProfileResponseWriter code:

import java.io.Writer;
import java.util.Iterator;

import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.QueryResponseWriter;
import org.apache.solr.response.ResultContext;
import org.apache.solr.response.SolrQueryResponse;

class ProfileResponseWriterBase implements QueryResponseWriter {

    @Override
    public void write(Writer writer, SolrQueryRequest request,
            SolrQueryResponse response) {
        ...
        SolrDocumentList resp = prepareResp(response, objectWriter, writer);
        ...
    }

    private SolrDocumentList prepareResp(SolrQueryResponse response,
            ObjectWriter objectWriter, Writer writer) {
        ...
        // The "response" entry holds the ResultContext for the matched documents.
        Object resultContext = response.getValues().get("response");
        if (resultContext instanceof ResultContext) {
            ResultContext context = (ResultContext) resultContext;
            return toSolrDocumentList(context);
        }
        ...
    }

    private SolrDocumentList toSolrDocumentList(ResultContext resultContext) {
        // Each next() call loads the stored fields for one docID.
        Iterator<SolrDocument> docIterators =
                resultContext.getProcessedDocuments();
        SolrDocumentList list = new SolrDocumentList();
        list.setNumFound(resultContext.getDocList().matches());
        list.setMaxScore(resultContext.getDocList().maxScore());
        list.setStart(resultContext.getDocList().offset());
        while (docIterators.hasNext()) {
            SolrDocument solrDocument = docIterators.next();
            // the exception is thrown by the next() call above
            list.add(solrDocument);
        }
        return list;
    }
}


Could somebody help me figure out how I can resolve this issue? Is this
a Solr issue, or have we implemented the response writer incorrectly?


-- 
Best regards,
Yauheni

Adding rare languages to Apache Solr

2019-03-06 Thread Kay Müller

Hello everybody,
 
My name is Kay and I am a Software Developer from Germany.

I would like to add support for Filipino, Lao, Malaysian and Vietnamese to Solr
(Solr v4 and Solr v6) and would be delighted to get hints on which
Tokenizer/Analyzer to use.

It seems that the ICU Tokenizer has support for Lao, but I was not
able to find any configuration examples with rules for Lao:
https://lucene.apache.org/solr/guide/6_6/language-analysis.html#LanguageAnalysis-Hebrew_Lao_Myanmar_Khmer

Any help would be more than welcome! Thanks!!

Kind regards

Kay

Re: LTR feature based on other collection data

2019-03-06 Thread Kamal Kishore Aggarwal

Any suggestions?

Thanks in advance.

On Tue, Feb 26, 2019 at 6:22 PM Kamal Kishore Aggarwal <
kkroyal@gmail.com> wrote:

> It looks to me that I can modify the SolrFeature class, but I don't know
> how to create the IndexSearcher and SolrQueryRequest params for the new
> request and the second collection.
>
>   @Override
>   public FeatureWeight createWeight(IndexSearcher searcher, boolean needsScores,
>       SolrQueryRequest request, Query originalQuery, Map<String,String[]> efi)
>       throws IOException {
>     return new SolrFeatureWeight(searcher, request, originalQuery, efi);
>   }
>
> Regards
> Kamal
>
>
> On Tue, Feb 26, 2019 at 12:34 PM Kamal Kishore Aggarwal <
> kkroyal@gmail.com> wrote:
>
>> Hi,
>>
> >> I am working on LTR using Solr 6.6.2. I am working on custom feature
> >> creation. I am able to create a few custom features as per our requirements.
> >>
> >> But there are certain features for which the data is stored in another
> >> collection: data like the count of clicks, the last date when the product
> >> was ordered, etc. This type of information is stored in another collection
> >> and we are not planning to put it in the first collection.
> >>
> >> Now, we need to use the data in the other collection to generate the score
> >> of the document in LTR. We are open to developing custom components as well.
> >>
> >> Is there a way we can modify our query using some join? We know, however,
> >> that join is expensive.
>>
>> Please suggest. Thanks in advance.
>>
>> Regards
>> Kamal Kishore
>>
>

Re: Trying to enable HTTP gzip compression

2019-03-06 Thread Luthien Dulk

Hi Walter,

You’re right, this is going nowhere. 
We thought that the bottleneck might be the HTTP connection between the API
running on Cloud Foundry and the Solr cluster on an external host.
Potentially saving bandwidth on that seemed (at first glance) like too
good an option not to look into.

I just would have liked to see that confirmed in a performance test, but after 
discussing it again with our sysadmin I don’t want to waste any more of his or 
my time trying to make that work. 
Ah well, at least I learned a few things about Solr :)

Thanks for your comments. 

Lúthien

> On 27 Feb 2019, at 17:08, Walter Underwood  wrote:
> 
> I really do not expect it to make anything faster. I think you are wasting 
> your time. Compression also adds some latency because the compression happens 
> before data is sent out. 
> 
> If your CPUs are idle, that is a red flag for performance. In every one of 
> our clusters, CPU is the limiting factor in both latency and throughput. Our 
> largest production cluster is 32 nodes, each with 36 CPUs.
> 
> Where is the bottleneck? Are the processes waiting on disk? If they are, you 
> need more RAM. Do you have magnetic disks? Get SSDs.
> 
> You should have enough RAM to hold the index in memory, after allowing for 
> the Solr JVM, kernel, and other processes.


AW: RegexReplaceProcessorFactory pattern to detect multiple \n

2019-03-06 Thread paul.dodd

Hi Edwin



You are correct re the 2nd pattern – my bad. Looking at the 4 <br>, it’s
actually the sequence «  »? So perhaps the first match pattern
could be [ \t\x0b\f]*\r?\n



i.e. [space tab vertical-tab formfeed]



Regards,

Paul



Sent from Mail for Windows 10



From: Zheng Lin Edwin Yeo
Sent: Wednesday, 6 March 2019 07:44
To: solr-user@lucene.apache.org
Subject: Re: RegexReplaceProcessorFactory pattern to detect multiple \n



Hi Paul,

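I have modified the second pattern to be (<br>){3,}, instead of
(<br><br>){3,}. This pattern of (<br><br>){3,}
will actually look for 6 or more <br> instead of 3 <br>, as we have put
the <br> two times in the pattern, which is the reason that there are more
<br> in the result, as cases where there are less than 6 <br> are not being
replaced, so we ended up having up to 5 <br> in the index.

Modified configuration:
<processor class="solr.RegexReplaceProcessorFactory">
   <str name="fieldName">content</str>
   <str name="pattern">(&lt;br&gt;){3,}</str>
   <str name="replacement">&lt;br&gt;&lt;br&gt;</str>
   <bool name="literalReplacement">true</bool>
</processor>

This will bring us back to the result of the previous index content,
meaning the issue of having the 4 <br> is still there.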

Regards,
Edwin



Regards,
Edwin

On Wed, 6 Mar 2019 at 11:37, Zheng Lin Edwin Yeo 
wrote:

> Hi Paul,
>
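> Further to my previous email, in which there was an extra "}" in the
> configuration, I have changed to use the below configuration based on your
> suggestion.
>
> <processor class="solr.RegexReplaceProcessorFactory">
>    <str name="fieldName">content</str>
>    <str name="pattern">[ \t]*\r?\n</str>
>    <str name="replacement">&lt;br&gt;</str>
>    <bool name="literalReplacement">true</bool>
> </processor>
> <processor class="solr.RegexReplaceProcessorFactory">
>    <str name="fieldName">content</str>
>    <str name="pattern">(&lt;br&gt;&lt;br&gt;){3,}</str>
>    <str name="replacement">&lt;br&gt;&lt;br&gt;</str>
>    <bool name="literalReplacement">true</bool>
> </processor>
>
> However, the result that I get still has more than 2 <br>. In fact, the
> result became worse, as you can see from the comparison below.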
>
> Example 1: A sentence for which the regex pattern used to work correctly. But
> with the latest pattern, it has now changed from 2 <br> to 5 <br>,
> which is wrong.
> *Original content in EML file:*
> Dear Sir,
>
>
> I am terminating
> *Original content:*Dear Sir,  \n\n \n \n\n I am terminating
> *Previous Index content: *Dear Sir,  I am terminating
> *Current Index content*:   Dear Sir,  I am terminating
>
> Example 2: A sentence for which the above regex pattern is partially working
> (as you can see, instead of 2 <br>, there are 4 <br>)
> *Original content in EML file:*
>
> *exalted*
>
> *Psalm 89:17*
>
>
> 3 Choa Chu Kang Avenue 4
> *Original content:* exalted  \n \n\n   Psalm 89:17   \n\n   \n\n  3 Choa
> Chu Kang Avenue 4, Singapore
> *Previous Index content: *exalted  Psalm 89:17   
> 3 Choa Chu Kang Avenue 4, Singapore
> *Current Index content*:Psalm 89:173
> Choa Chu Kang Avenue 3, Singapor4
>
> Example 3: A sentence for which the above regex pattern is partially working
> (as you can see, instead of 2 <br>, there are 4 <br>). For the latest code,
> there are now 5 <br>
> *Original content in EML file:*
>
> http://www.concorded.com/
>
>
>
>
>
>
>
>
> On Tue, Dec 18, 2018 at 10:07 AM
> *Original content:* http://www.concorded.com/   \n\n   \n\n \n \n\n \n\n
> \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n\n \n\n\n  On Tue, Dec 18, 2018 at
> 10:07 AM
> *Previous Index content: *http://www.concorded.com/   
> On Tue, Dec 18, 2018 at 10:07 AM
> *Current Index content:* http://www.concorded.com/  
> On Tue, Dec 18, 2018 at 10:07 AM
>
>
> Regards,
> Edwin
>
> On Wed, 6 Mar 2019 at 00:29, Zheng Lin Edwin Yeo 
> wrote:
>
>> Hi Paul,
>>
>> Thank you for the reply.
>>
>> I have tried to add the following configuration according to your
>> suggestion:
>>
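>> <processor class="solr.RegexReplaceProcessorFactory">
>>    <str name="fieldName">content</str>
>>    <str name="pattern">[ \t]*\r?\n}</str>
>>    <str name="replacement">&lt;br&gt;</str>
>>    <bool name="literalReplacement">true</bool>
>> </processor>
>>
>> <processor class="solr.RegexReplaceProcessorFactory">
>>    <str name="fieldName">content</str>
>>    <str name="pattern">(&lt;br&gt;&lt;br&gt;){3,}</str>
>>    <str name="replacement">&lt;br&gt;&lt;br&gt;</str>
>>    <bool name="literalReplacement">true</bool>
>> </processor>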
>>
>> However, none of the \n is being removed this time round.
>> Is the order and/or the pattern correct?
>>
>> Regards,
>> Edwin
>>
>> On Tue, 5 Mar 2019 at 19:54,  wrote:
>>
>>> Hi Edwin
>>>
>>>
>>>
>>> Try for the first pattern/replacement
>>>
>>> [ \t]*\r?\n
>>>
>>> <br>
>>>
>>> Now all line endings and preceding whitespace characters should be
>>> changed to ‘<br>’.
>>>
>>> The second pattern replacement should replace 3 or more ‘<br>’ sequences
>>> with 2 ‘<br>’ sequences:
>>>
>>> (<br><br>){3,}
>>>
>>> <br><br>
>>>
>>>
>>>
>>> Hope this approach works. Sorry for not replying earlier and best
>>> regards,
>>>
>>> Paul
>>>
>>>
>>>
>>>
>>>
>>> Sent from Mail for Windows 10
>>>
>>>
>>>
>>> From: Zheng Lin Edwin Yeo
>>> Sent: Tuesday, 5 March 2019 03:35
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: RegexReplaceProcessorFactory pattern to detect multiple \n
>>>
>>>
>>>
>>> Hi,
>>>
>>> For your info, this issue is occurring in the new Solr 7.7.1 as well.
>>>
>>> Regards,
>>> Edwin
>>>
>>> On Mon, 25 Feb 2019 at 10:28, Zheng Lin Edwin Yeo 
>>> wrote:
>>>
>>> > Hi,
>>> >
>>> > Does anyone else have other suggestions, or has anyone faced the same problem?
>>> >
>>> > Regards,
>>> > Edwin
>>> >
>>> > On Wed, 20 Feb 2019 at 16:58, Zheng Lin Edwin Yeo <
>>> edwinye...@gmail.com>
>>> > wrot

DateRangeField, but with Integers instead of dates?

2019-03-06 Thread Ryan Yacyshyn

Hi all,

Is there a way I can perform a filter query on a field that contains a
numeric range? Very much like how we can query over a date range:

https://lucene.apache.org/solr/guide/7_3/working-with-dates.html#more-daterangefield-details

I'd like to index documents that have a field containing a range of
integers, such as:

doc 1 --> { my_field: [200 TO 250] }
doc 2 --> { my_field: [240 TO 270] }

And then have a query such as fq={!field f=my_field op=Contains}[210 TO
230] to return the first doc. This is possible with Dates, can I do
something like this with integers?

Couldn't find much online on this, if anyone can point me in the right
direction that would be great!

Ryan

Hide BasicAuth JVM param on SOLR admin UI

2019-03-06 Thread el mas capo

Hi everyone,
I am trying to configure SolrCloud (7.7.0) with Basic Authentication. All
seems to work nicely, but when I open the Web UI I can see the Basic Auth
password configured in solr.in.sh in clear text:
-Dbasicauth=solr:SolrRocks
Can this behaviour be avoided?
Thank you for your attention.

Re: Run solrj in parallel how it works

2019-03-06 Thread Mikhail Khludnev

To be able to run it in parallel, one needs to copy the DataImportHandler
definition to /dataimport2;
otherwise it rejects the second parallel attempt. Sic.

On Tue, Mar 5, 2019 at 1:08 PM sami  wrote:

> I am a little bit confused about the parallel running option for SolrJ: how
> to configure the core and what it means exactly. Right now, I create a new
> core with the Solr admin console. The main requirement is to have a conf
> folder with solrconfig.xml and data-config.xml defined. Now, I run my
> program as discussed earlier here.
>
>
> http://lucene.472066.n3.nabble.com/Index-database-with-SolrJ-using-xml-file-directly-throws-an-error-td4426491.html
>
> It works fine. Now, I want the query in my data-config.xml
> file to be based on date. What I mean exactly:
>
> I have a database for the year 2016, by month. I index half of it with
> data-config1.xml and the other half with data-config2.xml. Can I just run
> the SolrJ program two times with two separate XML files, and will it index
> all of my data into one core?
>
> String url = "http://localhost:8983/solr/test";
> HttpSolrClient server = new HttpSolrClient.Builder(url).build();
> ModifiableSolrParams params = new ModifiableSolrParams();
> params.set("qt", "/dataimport");
> params.set("command", "full-import");
> params.set("clean", "true");
> params.set("commit", "true");
> params.set("optimize", "true");
> params.set("config", "data-config1.xml");
> server.query(params);
>
> and running this program again as
>
> String url = "http://localhost:8983/solr/test";
> HttpSolrClient server = new HttpSolrClient.Builder(url).build();
> ModifiableSolrParams params = new ModifiableSolrParams();
> params.set("qt", "/dataimport");
> params.set("command", "full-import");
> params.set("clean", "true");
> params.set("commit", "true");
> params.set("optimize", "true");
> params.set("config", "data-config2.xml");
> server.query(params);
>
> Can I run these 2 simultaneously? Will it all be indexed in core test, and
> will queries work fine? Can someone explain a bit here?
>
> Can I use Java multi-threading here? If yes, how? A little bit more
> elaboration would help.
>
> Thanks in advance!
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


-- 
Sincerely yours
Mikhail Khludnev
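
To illustrate the two-handler setup described above, here is a minimal Java sketch that issues two imports in parallel, one per handler. It is only a sketch under stated assumptions: a second handler is registered at /dataimport2, the class and method names are hypothetical, and the HTTP call is abstracted behind a function so the example stays self-contained. Since clean=true deletes existing documents before an import, the sketch passes clean=false so the two runs do not wipe each other's data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Sketch only: builds one full-import URL per handler and runs them in
// parallel. The sender function is a stand-in for a real HTTP client.
final class ParallelImportSketch {

    static String importUrl(String coreUrl, String handler, String config, boolean clean) {
        return coreUrl + handler + "?command=full-import&clean=" + clean
                + "&commit=true&config=" + config;
    }

    /** Sends every URL on its own thread and returns the responses in order. */
    static List<String> runAll(List<String> urls, Function<String, String> sender) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, urls.size()));
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String url : urls) {
                Callable<String> task = () -> sender.apply(url);
                pending.add(pool.submit(task));
            }
            List<String> responses = new ArrayList<>();
            for (Future<String> future : pending) {
                responses.add(future.get());
            }
            return responses;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String core = "http://localhost:8983/solr/test";
        List<String> urls = Arrays.asList(
                importUrl(core, "/dataimport", "data-config1.xml", false),
                importUrl(core, "/dataimport2", "data-config2.xml", false));
        // Stub sender: a real program would issue an HTTP GET here.
        runAll(urls, url -> "would request " + url).forEach(System.out::println);
    }
}
```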
