Anthill Inside and The Fifth Elephant Bengaluru India 2018 Edition

2018-07-23 Thread Amrit Sarkar
Anthill Inside and The Fifth Elephant -- HasGeek's marquee annual
conferences -- bring together business decision makers, data engineers,
architects, data scientists and product managers to understand the
nuances of managing and leveraging data. What's more, Solr community
members can avail a 10% discount on conference tickets via these links:

Anthill Inside: https://anthillinside.in/2018/?code=SG65IC
The Fifth Elephant: https://fifthelephant.in/2018/?code=SG65IC

Both conferences are produced by the community, for the community. They
cover the theoretical and practical applications of machine learning,
deep learning and artificial intelligence, as well as data collection
and the other implementation steps involved in building these systems.

Anthill Inside

Anthill Inside bridges the gap between research and industry, bringing
in speakers from both worlds in equal measure. Engage in nuanced, open
discussions on topics ranging from privacy and ethics in AI to breaking
down real-world systems into hubs and spokes. Hear about everything from
organizational questions, such as what machine learning can and cannot
do for your organization, to deeper technical issues, such as how to
build classification systems in the absence of large datasets.

Anthill Inside: 25 July
Registration link with 10% discount: https://anthillinside.in/2018/?code=SG65IC

The Fifth Elephant

Applying techniques and using data to build product features is the
primary flavour of The Fifth Elephant 2018. The wide variety of topics
includes:

1. Designing systems for data (hint: it's not only about the algorithms)
   a. How poor design can lower data quality, which in turn compromises
      the entire project: a case study on AADHAAR
   b. How any data, even something as meek as electoral data, can be
      weaponized against its users (voters, in this case)
2. Privacy issues with data
   a. The right to be forgotten: problems with data systems
   b. The right to privacy vs the right to information: the way forward
3. Handling super-large-scale data systems
4. Data visualization at scale, as at Uber for self-driving cars

Along with talks at the venue, there are open discussions on privacy and
open data, and workshops on Amazon SageMaker (26 July) and
recommendations using TensorFlow (27 July).

The Fifth Elephant: 26 and 27 July
Registration link with 10% discount: https://fifthelephant.in/2018/?code=SG65IC

For more details, write to i...@hasgeek.com or call 7676332020.

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2
Medium: https://medium.com/@sarkaramrit2


Re: Solr fails even ZK quorum has majority

2018-07-23 Thread Shalin Shekhar Mangar
Can you please open a Jira issue? I don't think we handle DNS problems very
well during startup. Thanks.
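Until Solr handles this more gracefully, one workaround is to verify that every host in ZK_HOST resolves before starting Solr. A minimal sketch; the ZK_HOST string and hostnames below are just placeholders matching the example in this thread:

```python
import socket

def unresolvable_hosts(zk_host):
    """Return the ZooKeeper hostnames in a ZK_HOST string that fail DNS.

    zk_host: a solr.in.sh-style value such as
    "server1:2181,server2:2181,server3:2181/collection".
    """
    # Strip an optional chroot suffix such as "/collection".
    hosts_part = zk_host.split("/", 1)[0]
    bad = []
    for entry in hosts_part.split(","):
        host = entry.split(":", 1)[0]
        try:
            socket.gethostbyname(host)
        except socket.gaierror:
            bad.append(host)
    return bad
```

Running a check like this before `bin/solr start` would flag the broken host up front, instead of editing ZK_HOST by hand after a failed startup.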

On Tue, Jul 24, 2018 at 2:31 AM Susheel Kumar  wrote:

> Something went wrong with DNS, which resulted in an unknown-host exception
> for one of the machines in our environment and caused Solr to throw the
> above exception.
>
> Erick, I have Solr configured using the service installation script, with
> the ZK_HOST entry in
> solr.in.sh="server1:2181,server2:2181,server3:2181/collection".
> After removing server1 from that list I was able to start Solr; otherwise
> it kept throwing the above exception.
>
> Thnx
>
>
> On Mon, Jul 23, 2018 at 4:20 PM, Erick Erickson 
> wrote:
>
> > And how do you start Solr? Do you use the entire 3-node ensemble address?
> >
> > On Mon, Jul 23, 2018 at 12:55 PM, Michael Braun 
> wrote:
> > > Per the exception, this looks like a network / DNS resolution issue,
> > > independent of Solr and Zookeeper code:
> > >
> > > Caused by: org.apache.solr.common.SolrException:
> > > java.net.UnknownHostException: ditsearch001.es.com: Name or service
> not
> > > known
> > >
> > > Is this address actually resolvable at the time?
> > >
> > > On Mon, Jul 23, 2018 at 3:46 PM, Susheel Kumar 
> > > wrote:
> > >
> > >> Under usual circumstances, when one ZooKeeper node goes down while the
> > >> other two are up, Solr continues to operate. But when one of the ZK
> > >> machines was not reachable, with ping returning the result below, Solr
> > >> couldn't start. See the stack trace below.
> > >>
> > >> ping: cannot resolve ditsearch001.es.com: Unknown host
> > >>
> > >>
> > >> Setup: Solr 6.6.2 and Zookeeper 3.4.10
> > >>
> > >> I had to remove this server name from the ZK_HOST list (solr.in.sh) in
> > >> order to get Solr started. Ideally, whatever the issue, Solr should
> > >> still start as long as a quorum majority is available.
> > >>
> > >> Has anyone noticed this issue?
> > >>
> > >> Thnx
> > >>

How to use tika-OCR in data import handler?

2018-07-23 Thread Yasufumi Mizoguchi
Hi,

I am trying to use Tika OCR (Tesseract) in the data import handler
and found that processing English documents works quite well.

But I am struggling to process other languages such as
Japanese, Chinese, etc.

So, I want to know how to switch Tesseract OCR's processing
language via the data import handler config or the tikaConfig param.

Any points would be appreciated.

Thanks,
Yasufumi
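For what it's worth, Tesseract's language can usually be switched through a Tika config file referenced from the data import handler. The sketch below is an assumption based on Tika's TesseractOCRParser: the parser class and the `language` parameter name should be verified against the Tika version bundled with your Solr, and the matching Tesseract language pack (e.g. jpn) must be installed on the machine.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical tika-config.xml: switch Tesseract OCR to Japanese. -->
<properties>
  <parsers>
    <parser class="org.apache.tika.parser.ocr.TesseractOCRParser">
      <params>
        <!-- Tesseract language code; combinations like "jpn+eng" may also work. -->
        <param name="language" type="string">jpn</param>
      </params>
    </parser>
  </parsers>
</properties>
```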


Re: Solr fails even ZK quorum has majority

2018-07-23 Thread Susheel Kumar
Something went wrong with DNS, which resulted in an unknown-host exception
for one of the machines in our environment and caused Solr to throw the
above exception.

Erick, I have Solr configured using the service installation script, with
the ZK_HOST entry in
solr.in.sh="server1:2181,server2:2181,server3:2181/collection".
After removing server1 from that list I was able to start Solr; otherwise
it kept throwing the above exception.

Thnx


On Mon, Jul 23, 2018 at 4:20 PM, Erick Erickson 
wrote:

> And how do you start Solr? Do you use the entire 3-node ensemble address?
>
> On Mon, Jul 23, 2018 at 12:55 PM, Michael Braun  wrote:
> > Per the exception, this looks like a network / DNS resolution issue,
> > independent of Solr and Zookeeper code:
> >
> > Caused by: org.apache.solr.common.SolrException:
> > java.net.UnknownHostException: ditsearch001.es.com: Name or service not
> > known
> >
> > Is this address actually resolvable at the time?
> >
> > On Mon, Jul 23, 2018 at 3:46 PM, Susheel Kumar 
> > wrote:
> >
> >> Under usual circumstances, when one ZooKeeper node goes down while the
> >> other two are up, Solr continues to operate. But when one of the ZK
> >> machines was not reachable, with ping returning the result below, Solr
> >> couldn't start. See the stack trace below.
> >>
> >> ping: cannot resolve ditsearch001.es.com: Unknown host
> >>
> >>
> >> Setup: Solr 6.6.2 and Zookeeper 3.4.10
> >>
> >> I had to remove this server name from the ZK_HOST list (solr.in.sh) in
> >> order to get Solr started. Ideally, whatever the issue, Solr should
> >> still start as long as a quorum majority is available.
> >>
> >> Has anyone noticed this issue?
> >>
> >> Thnx
> >>

Re: Solr fails even ZK quorum has majority

2018-07-23 Thread Erick Erickson
And how do you start Solr? Do you use the entire 3-node ensemble address?

On Mon, Jul 23, 2018 at 12:55 PM, Michael Braun  wrote:
> Per the exception, this looks like a network / DNS resolution issue,
> independent of Solr and Zookeeper code:
>
> Caused by: org.apache.solr.common.SolrException:
> java.net.UnknownHostException: ditsearch001.es.com: Name or service not
> known
>
> Is this address actually resolvable at the time?
>
> On Mon, Jul 23, 2018 at 3:46 PM, Susheel Kumar 
> wrote:
>
>> Under usual circumstances, when one ZooKeeper node goes down while the
>> other two are up, Solr continues to operate. But when one of the ZK
>> machines was not reachable, with ping returning the result below, Solr
>> couldn't start. See the stack trace below.
>>
>> ping: cannot resolve ditsearch001.es.com: Unknown host
>>
>>
>> Setup: Solr 6.6.2 and Zookeeper 3.4.10
>>
>> I had to remove this server name from the ZK_HOST list (solr.in.sh) in
>> order to get Solr started. Ideally, whatever the issue, Solr should
>> still start as long as a quorum majority is available.
>>
>> Has anyone noticed this issue?
>>
>> Thnx
>>

Re: Solr fails even ZK quorum has majority

2018-07-23 Thread Michael Braun
Per the exception, this looks like a network / DNS resolution issue,
independent of Solr and Zookeeper code:

Caused by: org.apache.solr.common.SolrException:
java.net.UnknownHostException: ditsearch001.es.com: Name or service not
known

Is this address actually resolvable at the time?

On Mon, Jul 23, 2018 at 3:46 PM, Susheel Kumar 
wrote:

> Under usual circumstances, when one ZooKeeper node goes down while the
> other two are up, Solr continues to operate. But when one of the ZK
> machines was not reachable, with ping returning the result below, Solr
> couldn't start. See the stack trace below.
>
> ping: cannot resolve ditsearch001.es.com: Unknown host
>
>
> Setup: Solr 6.6.2 and Zookeeper 3.4.10
>
> I had to remove this server name from the ZK_HOST list (solr.in.sh) in
> order to get Solr started. Ideally, whatever the issue, Solr should
> still start as long as a quorum majority is available.
>
> Has anyone noticed this issue?
>
> Thnx
>

Solr fails even ZK quorum has majority

2018-07-23 Thread Susheel Kumar
Under usual circumstances, when one ZooKeeper node goes down while the
other two are up, Solr continues to operate. But when one of the ZK
machines was not reachable, with ping returning the result below, Solr
couldn't start. See the stack trace below.

ping: cannot resolve ditsearch001.es.com: Unknown host


Setup: Solr 6.6.2 and Zookeeper 3.4.10

I had to remove this server name from the ZK_HOST list (solr.in.sh) in
order to get Solr started. Ideally, whatever the issue, Solr should still
start as long as a quorum majority is available.

Has anyone noticed this issue?

Thnx

2018-07-23 15:30:47.218 INFO  (main) [   ] o.e.j.s.Server
jetty-9.3.14.v20161028

2018-07-23 15:30:47.817 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter  ___
_   Welcome to Apache Solr™ version 6.6.2

2018-07-23 15:30:47.829 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter / __|
___| |_ _   Starting in cloud mode on port 8080

2018-07-23 15:30:47.830 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter \__
\/ _ \ | '_|  Install dir: /opt/solr

2018-07-23 15:30:47.861 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter
|___/\___/_|_|Start time: 2018-07-23T15:30:47.832Z

2018-07-23 15:30:47.863 INFO  (main) [   ] o.a.s.s.StartupLoggingUtils
Property solr.log.muteconsole given. Muting ConsoleAppender named CONSOLE

2018-07-23 15:30:47.929 INFO  (main) [   ] o.a.s.c.SolrResourceLoader Using
system property solr.solr.home: /app/solr/data

2018-07-23 15:30:48.037 ERROR (main) [   ] o.a.s.s.SolrDispatchFilter Could
not start Solr. Check solr/home property and the logs

2018-07-23 15:30:48.235 ERROR (main) [   ] o.a.s.c.SolrCore
null:org.apache.solr.common.SolrException: Error occurred while loading
solr.xml from zookeeper

at
org.apache.solr.servlet.SolrDispatchFilter.loadNodeConfig(SolrDispatchFilter.java:270)

at
org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:242)

at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:173)

at
org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:137)

at
org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:873)

at
org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:349)

at
org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1404)

at
org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1366)

at
org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:778)

at
org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:262)

at
org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:520)

at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)

at
org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:41)

at
org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:188)

at
org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:499)

at
org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:147)

at
org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:180)

at
org.eclipse.jetty.deploy.providers.WebAppProvider.fileAdded(WebAppProvider.java:458)

at
org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:64)

at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:610)

at
org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:529)

at org.eclipse.jetty.util.Scanner.scan(Scanner.java:392)

at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:313)

at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)

at
org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:150)

at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)

at
org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:561)

at
org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:236)

at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)

at
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)

at org.eclipse.jetty.server.Server.start(Server.java:422)

at
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:113)

at
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:61)

at org.eclipse.jetty.server.Server.doStart(Server.java:389)

at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)

at
org.eclipse.jetty.xml.XmlConfiguration$1.run(XmlConfiguration.java:1516)

at 

Re: SolrCloud acceptable latency, when to use CDCR?

2018-07-23 Thread Erick Erickson
It Depends (tm).

There are several issues when communications are unreliable, basically
all having to do with timeouts.

> ZK not getting "keep alive" requests back in time and marking the node as down
> leaders not getting responses back from followers in time and putting
> them into recovery.
> client timeouts because of slow connections

Plus, consider a single Solr query in a sharded environment. It has to:
> send a sub request to one replica of each shard
> get the sub request back
> sort the true top N
> request the actual doc from each replica in the first step
> send the final response to the client.

Or indexing. Let's say a doc arrives at replica N. Then
> the doc is forwarded to the leader for the appropriate shard
> the doc is sent to each replica in that shard
> the response comes back to the leader
> the response is ack'd back to the original node receiving the request
> the response is ack'd back to the client.

The point is that there's a _lot_ of communication between Solr nodes,
and if you have a slow pipe connecting them, overall latency increases.
But whether that latency is acceptable for your application, only you
can tell.
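As a back-of-the-envelope illustration of why the pipe matters, the two-phase query described above costs at least two full round trips plus the reply hop before the client sees anything. A toy model, with all numbers illustrative and the simplifying assumptions that one-way latency is uniform and sub-requests run in parallel:

```python
def sharded_query_latency_ms(hop_ms, shard_work_ms):
    """Rough wall-clock cost of one distributed Solr query.

    hop_ms: one-way network latency between any two nodes (assumed uniform).
    shard_work_ms: per-shard processing times for a sub-request.
    """
    # Phase 1: scatter sub-requests to one replica per shard and gather the
    # top-N ids. Parallel, so bounded by the slowest shard plus one round trip.
    phase1 = 2 * hop_ms + max(shard_work_ms)
    # Phase 2: fetch the actual docs from those replicas (another round trip).
    phase2 = 2 * hop_ms + max(shard_work_ms)
    # One more hop to send the final response to the client.
    return phase1 + phase2 + hop_ms
```

With 1 ms hops the query cost is dominated by shard work; with 50 ms hops the same query pays 250 ms in network hops alone, which is the kind of penalty a cross-data-center pipe imposes.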

But be a little careful here. None of the above addresses a "problematic
network" in the sense of a network where slowness is totally expected.
CDCR is intended for the case where "problematic" means separate data
centers connected by a high-latency pipe.

Best,
Erick

On Mon, Jul 23, 2018 at 3:01 AM, Pavel Micka  wrote:
> Hi,
>
> We are discussing advantages of SolrCloud Replication and Cross Data Center 
> Replication (CDCR). In CDCR docs, it is written that
> "The SolrCloud architecture is not particularly well suited for situations 
> where a single SolrCloud cluster consists of nodes in separated data clusters 
> connected by an expensive pipe".
>
> But we cannot find what latency is acceptable for SolrCloud/ZK, or when
> we should start considering CDCR (master-slave). And what issues would
> arise if we installed SolrCloud on a problematic network?
>
> Thanks in advance,
>
> Pavel


Re: /state.json vs /clusterstate.json

2018-07-23 Thread Erick Erickson
Oh boy. These bits are awkward during the transition but I think you'll be OK.

bq. "but for some of them there is no entry state.json"

This is a bit concerning. Are the entries in clusterstate.json for
valid replicas that are _not_ in the associated state.json?

There is a cluster property "legacyCloud". When it is true (the default
before 7.0) and Solr finds replicas lying around on disk, it
reconstructs the znode from the information in the associated
core.properties file in the replica's directory. In, you guessed it,
clusterstate.json. So if you, say, shut down a Solr node with replicas
on it, then deleted the collection and then brought the Solr node
back up, those replicas would re-appear in clusterstate.json.

So if you have live replicas in clusterstate.json but _not_ in
state.json then somehow you have to get them to the right place. If
not, they can be safely deleted from clusterstate.json.

What you really want, though, is to not have anything in
clusterstate.json. So here's what I'd do:

> Create a different ZK ensemble, maybe a single node on your local box,
> specifically _not_ connected to your prod system.
> See what happens if you issue the MIGRATESTATEFORMAT command to that
> isolated ZK node. Does the result conform to your prod system? If so,
> you can run it on your prod system.
> Once you're happy with the individual state.json files, go ahead and
> migrate those to prod.
> Set the legacyCloud property to false.

NOTES:
1> you need to have an empty clusterstate.json file, one that just
consists of {}.
2> you can use the "bin/solr zk" series of commands to overwrite
individual znodes. Or zkcli, whichever you find easiest.
3> I'd _really_ recommend backing things up first!
4> There are visual ZK node editor tools out there; if this gets
really complex it would probably be worth investing in one.
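To gauge how much the migration has to move, it can help to diff the two formats first. A small sketch, under the structural assumption that clusterstate.json is a JSON object keyed by collection name (verify against your actual znode contents):

```python
import json

def collections_still_in_clusterstate(clusterstate_raw, migrated):
    """Collection names still living in the shared clusterstate.json.

    clusterstate_raw: raw string contents of the /clusterstate.json znode
    (assumed to be a JSON object keyed by collection name).
    migrated: collections that already have /collections/<name>/state.json.
    """
    state = json.loads(clusterstate_raw)
    return sorted(set(state) - set(migrated))

# After a clean migration, clusterstate.json should hold just "{}",
# so nothing is left over:
assert collections_still_in_clusterstate("{}", ["coll_a"]) == []
```

The znode contents can be fetched with the "bin/solr zk" commands or zkcli mentioned in note 2; the exact copy syntax varies by version, so check `bin/solr zk -help` first.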

Best,
Erick

On Mon, Jul 23, 2018 at 3:27 AM, Patrick Recchia
 wrote:
> From what I knew until today, the status of a Solr cluster used to be
> stored in a single ZK entry, /clusterstate.json, but since Solr 5.0 it
> is stored per collection under /collections/<collection>/state.json.
>
> We are having issues with our cluster, and I have noticed today that
> for most of the collections there is a state.json entry at
> /collections/<collection>/state.json,
> but for some of them there is no state.json entry at all.
> On the other hand, there is a /clusterstate.json, which I would not have
> expected.
>
> What is going on?
> Who decides where the state of a collection is written to?
> Can I force it somehow?
>
> Because, from what I can understand, we're facing the 'few hundreds
> collections' issue I've read about some time ago.
>
> Let me explain:
>
> Just a few figures:
> - we currently have 103 collections
> - most of them  have 40 shards and 2 replicas each
> Which brings to approx 800 replicas in total.
>
> Now, we had found references somewhere on the net saying that the 'number
> of collections' of a solr cluster should remain within the 'few hundreds'
> range.
> because of performance issue. Since each 'collection' would point to the
> same zk entry.
> Comment seemed to be bound to solr 4, though.
>
> But now, we have reached 800 nodes. Which shouldn't be a problem if they
> cluster in groups of 80 nodes at a time (1 collection).
> But is definitely an issue if they all point to a single zk node.
>
> Thanks already for any hint at where to look
>
> Patrick


Re: Question

2018-07-23 Thread Alexandre Rafalovitch
That depends on what you mean by "unstructured" and "handle".

If by "unstructured" you mean things like PDFs and MSWord - which are
structured under the covers, then yes. Solr ships with Apache Tika to
ingest such documents (see the shipped examples as well as the Data
Import Handler example). E.g.
http://lucene.apache.org/solr/guide/7_4/uploading-data-with-solr-cell-using-apache-tika.html#uploading-data-with-solr-cell-using-apache-tika
and 
http://lucene.apache.org/solr/guide/7_4/uploading-structured-data-store-data-with-the-data-import-handler.html
 You do have to map what you extract to what you mean by "handle".
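As a minimal sketch of the Solr Cell route (collection name, port, and file path below are illustrative assumptions, not from this thread):

```shell
# Send a PDF to the /update/extract handler; Tika parses it server-side
# and the extracted text is indexed according to your field mappings.
curl 'http://localhost:8983/solr/mycollection/update/extract?literal.id=doc1&commit=true' \
  -F 'myfile=@/path/to/example.pdf'
```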

If you mean just long blob of text (e.g. whole book as a plain text
file), then we go straight to "handle" - what you want to find and how
you want to search for it.

So, think backwards from the search. What you need to find and then
what you have. Then come back with the question on how to connect the
dots in the middle.

Regards,
   Alex.

On 23 July 2018 at 07:02, Driss Khalil  wrote:
> Hi,
> I'm new to Solr and I just want to know if it's possible to handle
> Unstrcutured data in solr .If yes how can we do it ? Do we need it to
> combine it with something else?
>
>
>
>
>
> *Driss KHALIL*
>
> Responsable prospection & sponsoring, Forum GENI Entreprises.
>
> Elève ingénieur en Génie Logiciel, ENSIAS.
> GSM: (+212) 06 62 52 83 26
>
> [image: https://www.linkedin.com/in/driss-khalil-b3aab4151/]
> 


Re: Can I use RegEx function?

2018-07-23 Thread Peter Sh
Right you are. I hadn't known it could be done at index time.

On Mon, Jul 23, 2018 at 3:43 PM Erik Hatcher  wrote:

> this is best done at index-time.   (it seems like you're trying to avoid
> doing that though)
>
>
>
> > On Jul 23, 2018, at 5:36 AM, Peter Sh  wrote:
> >
> > I want to be able to parse "KEY:VALUE" pairs from my text and have a
> facet
> > representing distribution of VALUES
> >
> > On Mon, Jul 23, 2018 at 12:25 PM Markus Jelsma <
> markus.jel...@openindex.io>
> > wrote:
> >
> >> Hello,
> >>
> >> Neither fl nor facet.field support functions, but facet.query is
> analogous
> >> to the latter. I do not understand what you need/want with fl and regex.
> >>
> >> Regards,
> >> Markus
> >>
> >>
> >>
> >> -Original message-
> >>> From:Peter Sh 
> >>> Sent: Monday 23rd July 2018 11:21
> >>> To: solr-user@lucene.apache.org
> >>> Subject: Re: Can I use RegEx function?
> >>>
> >>> Can I use it in "fl" and  "facet.field" as a function
> >>>
> >>> On Mon, Jul 23, 2018 at 11:33 AM Markus Jelsma <
> >> markus.jel...@openindex.io>
> >>> wrote:
> >>>
>  Hello,
> 
>  The usual faceting works for all queries,
> facet.query=q:field:/[a-z]+$/
>  will probably work too, i would be really surprised if it didn't. Keep
> >> in
>  mind that my example doesn't work, the + needs to be URL encoded!
> 
>  Regards,
>  Markus
> 
> 
> 
>  -Original message-
> > From:Peter Sh 
> > Sent: Monday 23rd July 2018 10:26
> > To: solr-user@lucene.apache.org
> > Subject: Re: Can I use RegEx function?
> >
> > can it be used in facets?
> >
> > On Mon, Jul 23, 2018, 11:24 Markus Jelsma <
> >> markus.jel...@openindex.io>
> > wrote:
> >
> >> Hello,
> >>
> >> It is not really obvious in documentation, but the standard query
>  parser
> >> supports regular expressions. Encapsulate your regex with forward
>  slashes
> >> /, q=field:/[a-z]+$/ will work.
> >>
> >> Regards,
> >> Markus
> >>
> >>
> >>
> >> -Original message-
> >>> From:Peter Sh 
> >>> Sent: Monday 23rd July 2018 10:09
> >>> To: solr-user@lucene.apache.org
> >>> Subject: Can I use RegEx function?
> >>>
> >>> I've got collection with a string or text field storing
> >> free-text.
>  I'd
> >> like
> >>> to use some RexEx function looking for patterns like "KEY:VALUE"
>  from the
> >>> text and use it for filtering and faceting.
> >>>
> >>
> >
> 
> >>>
> >>
>
>


Re: Solr 7 replication speed cap?

2018-07-23 Thread David Hastings
Actually this can be ignored. I think Solr 5 used Mb (megabits) in the
admin interface and Solr 7 is using MB (megabytes), correct?
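As a quick arithmetic check on that units hypothesis (pure illustration, not from the thread): 100 megabits/s is 12.5 megabytes/s, so a units change alone would not account for a drop from 100 down to 5.1.

```python
def mbit_to_mbyte_per_sec(mbit_per_sec):
    """Convert megabits per second to megabytes per second (8 bits/byte)."""
    return mbit_per_sec / 8.0

# If Solr 5's "100" was really megabits, the true rate was 12.5 MB/sec,
# which is still well above the 5.1 MB/sec observed on Solr 7.
print(mbit_to_mbyte_per_sec(100))  # prints 12.5
```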

On Mon, Jul 23, 2018 at 9:33 AM, David Hastings <
hastings.recurs...@gmail.com> wrote:

> Hey all, just set up a tradition solr slave to my indexing master
> alongside a solr 5 instance.  on solr 5 we were getting about 100 MB/sec
> over our interface, and it would divide accordingly for how many slaves
> were replicating, ie 50 MB each if two slaves were replicating 33 for three
> so on a so forth. But the solr 7 instance, on the exact same interface it
> seems its capping itself at 5.1 MB/sec, which obviously is an unacceptable
> speed.  Is there a new setting in the jetty server perhaps or some where
> else that is limiting the bandwidth use?
>
> Thanks, David
>


Re: Question

2018-07-23 Thread Andrea Gazzarini
Hi Driss,
I think the answer to the first question is yes, but I guess it doesn't
help you much.
Second and third questions: "it depends". You should describe your
context better, narrowing the questions as much as possible ("how can we
do it" is far too generic).

Best,
Andrea


Il lun 23 lug 2018, 15:18 Driss Khalil  ha scritto:

> Hi,
> I'm new to Solr and I just want to know if it's possible to handle
> Unstrcutured data in solr .If yes how can we do it ? Do we need it to
> combine it with something else?
>
>
>
>
>
> *Driss KHALIL*
>
> Responsable prospection & sponsoring, Forum GENI Entreprises.
>
> Elève ingénieur en Génie Logiciel, ENSIAS.
> GSM: (+212) 06 62 52 83 26
>
> [image: https://www.linkedin.com/in/driss-khalil-b3aab4151/]
> 
>


Solr 7 replication speed cap?

2018-07-23 Thread David Hastings
Hey all, I just set up a traditional Solr slave to my indexing master
alongside a Solr 5 instance. On Solr 5 we were getting about 100 MB/sec
over our interface, and it would divide accordingly for however many
slaves were replicating, i.e. 50 MB each for two slaves, 33 for three,
and so on. But the Solr 7 instance, on the exact same interface, seems to
be capping itself at 5.1 MB/sec, which obviously is an unacceptable
speed. Is there a new setting in the Jetty server perhaps, or somewhere
else, that is limiting the bandwidth use?

Thanks, David


Question

2018-07-23 Thread Driss Khalil
Hi,
I'm new to Solr and I just want to know if it's possible to handle
unstructured data in Solr. If yes, how can we do it? Do we need to
combine it with something else?





*Driss KHALIL*

Responsable prospection & sponsoring, Forum GENI Entreprises.

Elève ingénieur en Génie Logiciel, ENSIAS.
GSM: (+212) 06 62 52 83 26

[image: https://www.linkedin.com/in/driss-khalil-b3aab4151/]



Re: Can I use RegEx function?

2018-07-23 Thread Erik Hatcher
this is best done at index-time.   (it seems like you're trying to avoid doing 
that though)
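A minimal sketch of that index-time approach (the pattern and field names are illustrative assumptions, not from the thread): extract the KEY:VALUE pairs with a regex before sending the document to Solr, and index each VALUE into its own facetable field.

```python
import re

# Matches word-like KEY followed by ':' and a non-space VALUE.
PAIR = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*):(\S+)")

def to_solr_doc(doc_id, text):
    """Turn 'KEY:VALUE' pairs in free text into dynamic *_s fields."""
    doc = {"id": doc_id, "text": text}
    for key, value in PAIR.findall(text):
        doc.setdefault(key.lower() + "_s", []).append(value)
    return doc

print(to_solr_doc("1", "status:OK level:3 some free text"))
```

Faceting on `status_s` then gives the distribution of VALUES directly, with no query-time regex needed.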



> On Jul 23, 2018, at 5:36 AM, Peter Sh  wrote:
> 
> I want to be able to parse "KEY:VALUE" pairs from my text and have a facet
> representing distribution of VALUES
> 
> On Mon, Jul 23, 2018 at 12:25 PM Markus Jelsma 
> wrote:
> 
>> Hello,
>> 
>> Neither fl nor facet.field support functions, but facet.query is analogous
>> to the latter. I do not understand what you need/want with fl and regex.
>> 
>> Regards,
>> Markus
>> 
>> 
>> 
>> -Original message-
>>> From:Peter Sh 
>>> Sent: Monday 23rd July 2018 11:21
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Can I use RegEx function?
>>> 
>>> Can I use it in "fl" and  "facet.field" as a function
>>> 
>>> On Mon, Jul 23, 2018 at 11:33 AM Markus Jelsma <
>> markus.jel...@openindex.io>
>>> wrote:
>>> 
 Hello,
 
 The usual faceting works for all queries, facet.query=q:field:/[a-z]+$/
 will probably work too, i would be really surprised if it didn't. Keep
>> in
 mind that my example doesn't work, the + needs to be URL encoded!
 
 Regards,
 Markus
 
 
 
 -Original message-
> From:Peter Sh 
> Sent: Monday 23rd July 2018 10:26
> To: solr-user@lucene.apache.org
> Subject: Re: Can I use RegEx function?
> 
> can it be used in facets?
> 
> On Mon, Jul 23, 2018, 11:24 Markus Jelsma <
>> markus.jel...@openindex.io>
> wrote:
> 
>> Hello,
>> 
>> It is not really obvious in documentation, but the standard query
 parser
>> supports regular expressions. Encapsulate your regex with forward
 slashes
>> /, q=field:/[a-z]+$/ will work.
>> 
>> Regards,
>> Markus
>> 
>> 
>> 
>> -Original message-
>>> From:Peter Sh 
>>> Sent: Monday 23rd July 2018 10:09
>>> To: solr-user@lucene.apache.org
>>> Subject: Can I use RegEx function?
>>> 
>>> I've got collection with a string or text field storing
>> free-text.
 I'd
>> like
>>> to use some RexEx function looking for patterns like "KEY:VALUE"
 from the
>>> text and use it for filtering and faceting.
>>> 
>> 
> 
 
>>> 
>> 



/state.json vs /clusterstate.json

2018-07-23 Thread Patrick Recchia
From what I know until today, the status of a solr cluster used to be
stored in a zk entry /clusterstate.json; but is now, from solr 5.0, stored
within a sub-folder /collections//state.json.

We are having issues with our cluster, and I have noticed today that:
for most of the collections  there is a /state.json entry within
/collections//state.json

but for some of them there is no entry state.json.
On the other hand, there is a /clusterstate.json; which I would have not
expected.

What is going on?
Who decides where the state of a collection is written to?
Can I force it somehow?

Because, from what I can understand, we're facing the 'few hundreds
collections' issue I've read about some time ago.

Let me explain:

Just a few figures:
- we currently have 103 collections
- most of them  have 40 shards and 2 replicas each
Which brings to approx 800 replicas in total.

Now, we had found references somewhere on the net saying that the number
of collections in a Solr cluster should remain within the 'few hundreds'
range because of performance issues, since each collection would point to
the same ZK entry. That comment seemed to be bound to Solr 4, though.

But now we have reached 800 replicas. That shouldn't be a problem if they
cluster in groups of 80 at a time (one collection), but it is definitely
an issue if they all point to a single ZK node.

Thanks already for any hint at where to look

Patrick


SolrCloud acceptable latency, when to use CDCR?

2018-07-23 Thread Pavel Micka
Hi,

We are discussing advantages of SolrCloud Replication and Cross Data Center 
Replication (CDCR). In CDCR docs, it is written that
"The SolrCloud architecture is not particularly well suited for situations 
where a single SolrCloud cluster consists of nodes in separated data clusters 
connected by an expensive pipe".

But we fail to find what latency is acceptable for SolrCloud/ZK, and when we 
should start considering CDCR (master-slave) instead. And what would the 
issues be if we installed SolrCloud on a problematic network?

Thanks in advance,

Pavel


Re: Can I use RegEx function?

2018-07-23 Thread Peter Sh
I want to be able to parse "KEY:VALUE" pairs from my text and have a facet
representing distribution of VALUES

On Mon, Jul 23, 2018 at 12:25 PM Markus Jelsma 
wrote:

> Hello,
>
> Neither fl nor facet.field support functions, but facet.query is analogous
> to the latter. I do not understand what you need/want with fl and regex.
>
> Regards,
> Markus
>
>
>
> -Original message-
> > From:Peter Sh 
> > Sent: Monday 23rd July 2018 11:21
> > To: solr-user@lucene.apache.org
> > Subject: Re: Can I use RegEx function?
> >
> > Can I use it in "fl" and  "facet.field" as a function
> >
> > On Mon, Jul 23, 2018 at 11:33 AM Markus Jelsma <
> markus.jel...@openindex.io>
> > wrote:
> >
> > > Hello,
> > >
> > > The usual faceting works for all queries, facet.query=q:field:/[a-z]+$/
> > > will probably work too, i would be really surprised if it didn't. Keep
> in
> > > mind that my example doesn't work, the + needs to be URL encoded!
> > >
> > > Regards,
> > > Markus
> > >
> > >
> > >
> > > -Original message-
> > > > From:Peter Sh 
> > > > Sent: Monday 23rd July 2018 10:26
> > > > To: solr-user@lucene.apache.org
> > > > Subject: Re: Can I use RegEx function?
> > > >
> > > > can it be used in facets?
> > > >
> > > > On Mon, Jul 23, 2018, 11:24 Markus Jelsma <
> markus.jel...@openindex.io>
> > > > wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > It is not really obvious in documentation, but the standard query
> > > parser
> > > > > supports regular expressions. Encapsulate your regex with forward
> > > slashes
> > > > > /, q=field:/[a-z]+$/ will work.
> > > > >
> > > > > Regards,
> > > > > Markus
> > > > >
> > > > >
> > > > >
> > > > > -Original message-
> > > > > > From:Peter Sh 
> > > > > > Sent: Monday 23rd July 2018 10:09
> > > > > > To: solr-user@lucene.apache.org
> > > > > > Subject: Can I use RegEx function?
> > > > > >
> > > > > > I've got collection with a string or text field storing
> free-text.
> > > I'd
> > > > > like
> > > > > > to use some RexEx function looking for patterns like "KEY:VALUE"
> > > from the
> > > > > > text and use it for filtering and faceting.
> > > > > >
> > > > >
> > > >
> > >
> >
>


RE: Can I use RegEx function?

2018-07-23 Thread Markus Jelsma
Hello,

Neither fl nor facet.field support functions, but facet.query is analogous to 
the latter. I do not understand what you need/want with fl and regex.

Regards,
Markus

 
 
-Original message-
> From:Peter Sh 
> Sent: Monday 23rd July 2018 11:21
> To: solr-user@lucene.apache.org
> Subject: Re: Can I use RegEx function?
> 
> Can I use it in "fl" and  "facet.field" as a function
> 
> On Mon, Jul 23, 2018 at 11:33 AM Markus Jelsma 
> wrote:
> 
> > Hello,
> >
> > The usual faceting works for all queries, facet.query=q:field:/[a-z]+$/
> > will probably work too, i would be really surprised if it didn't. Keep in
> > mind that my example doesn't work, the + needs to be URL encoded!
> >
> > Regards,
> > Markus
> >
> >
> >
> > -Original message-
> > > From:Peter Sh 
> > > Sent: Monday 23rd July 2018 10:26
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: Can I use RegEx function?
> > >
> > > can it be used in facets?
> > >
> > > On Mon, Jul 23, 2018, 11:24 Markus Jelsma 
> > > wrote:
> > >
> > > > Hello,
> > > >
> > > > It is not really obvious in documentation, but the standard query
> > parser
> > > > supports regular expressions. Encapsulate your regex with forward
> > slashes
> > > > /, q=field:/[a-z]+$/ will work.
> > > >
> > > > Regards,
> > > > Markus
> > > >
> > > >
> > > >
> > > > -Original message-
> > > > > From:Peter Sh 
> > > > > Sent: Monday 23rd July 2018 10:09
> > > > > To: solr-user@lucene.apache.org
> > > > > Subject: Can I use RegEx function?
> > > > >
> > > > > I've got collection with a string or text field storing free-text.
> > I'd
> > > > like
> > > > > to use some RexEx function looking for patterns like "KEY:VALUE"
> > from the
> > > > > text and use it for filtering and faceting.
> > > > >
> > > >
> > >
> >
> 


Re: Can I use RegEx function?

2018-07-23 Thread Peter Sh
Can I use it in "fl" and  "facet.field" as a function

On Mon, Jul 23, 2018 at 11:33 AM Markus Jelsma 
wrote:

> Hello,
>
> The usual faceting works for all queries, facet.query=q:field:/[a-z]+$/
> will probably work too, i would be really surprised if it didn't. Keep in
> mind that my example doesn't work, the + needs to be URL encoded!
>
> Regards,
> Markus
>
>
>
> -Original message-
> > From:Peter Sh 
> > Sent: Monday 23rd July 2018 10:26
> > To: solr-user@lucene.apache.org
> > Subject: Re: Can I use RegEx function?
> >
> > can it be used in facets?
> >
> > On Mon, Jul 23, 2018, 11:24 Markus Jelsma 
> > wrote:
> >
> > > Hello,
> > >
> > > It is not really obvious in documentation, but the standard query
> parser
> > > supports regular expressions. Encapsulate your regex with forward
> slashes
> > > /, q=field:/[a-z]+$/ will work.
> > >
> > > Regards,
> > > Markus
> > >
> > >
> > >
> > > -Original message-
> > > > From:Peter Sh 
> > > > Sent: Monday 23rd July 2018 10:09
> > > > To: solr-user@lucene.apache.org
> > > > Subject: Can I use RegEx function?
> > > >
> > > > I've got collection with a string or text field storing free-text.
> I'd
> > > like
> > > > to use some RexEx function looking for patterns like "KEY:VALUE"
> from the
> > > > text and use it for filtering and faceting.
> > > >
> > >
> >
>


RE: Can I use RegEx function?

2018-07-23 Thread Markus Jelsma
Hello,

The usual faceting works for all queries, facet.query=q:field:/[a-z]+$/ will 
probably work too, i would be really surprised if it didn't. Keep in mind that 
my example doesn't work, the + needs to be URL encoded!

Regards,
Markus
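The URL-encoding caveat above can be checked in Python; building the query string with the standard library percent-escapes the regex so Solr sees the literal '+' (the surrounding params are illustrative):

```python
from urllib.parse import urlencode

# urlencode percent-escapes the regex, so the literal '+' survives as
# %2B instead of being decoded back to a space by the server.
params = urlencode({
    "q": "*:*",
    "facet": "true",
    "facet.query": "field:/[a-z]+$/",
})
print(params)
```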

 
 
-Original message-
> From:Peter Sh 
> Sent: Monday 23rd July 2018 10:26
> To: solr-user@lucene.apache.org
> Subject: Re: Can I use RegEx function?
> 
> can it be used in facets?
> 
> On Mon, Jul 23, 2018, 11:24 Markus Jelsma 
> wrote:
> 
> > Hello,
> >
> > It is not really obvious in documentation, but the standard query parser
> > supports regular expressions. Encapsulate your regex with forward slashes
> > /, q=field:/[a-z]+$/ will work.
> >
> > Regards,
> > Markus
> >
> >
> >
> > -Original message-
> > > From:Peter Sh 
> > > Sent: Monday 23rd July 2018 10:09
> > > To: solr-user@lucene.apache.org
> > > Subject: Can I use RegEx function?
> > >
> > > I've got collection with a string or text field storing free-text. I'd
> > like
> > > to use some RexEx function looking for patterns like "KEY:VALUE" from the
> > > text and use it for filtering and faceting.
> > >
> >
> 


Re: Can I use RegEx function?

2018-07-23 Thread Peter Sh
can it be used in facets?

On Mon, Jul 23, 2018, 11:24 Markus Jelsma 
wrote:

> Hello,
>
> It is not really obvious in documentation, but the standard query parser
> supports regular expressions. Encapsulate your regex with forward slashes
> /, q=field:/[a-z]+$/ will work.
>
> Regards,
> Markus
>
>
>
> -Original message-
> > From:Peter Sh 
> > Sent: Monday 23rd July 2018 10:09
> > To: solr-user@lucene.apache.org
> > Subject: Can I use RegEx function?
> >
> > I've got collection with a string or text field storing free-text. I'd
> like
> > to use some RexEx function looking for patterns like "KEY:VALUE" from the
> > text and use it for filtering and faceting.
> >
>


RE: Can I use RegEx function?

2018-07-23 Thread Markus Jelsma
Hello,

It is not really obvious in documentation, but the standard query parser 
supports regular expressions. Encapsulate your regex with forward slashes /, 
q=field:/[a-z]+$/ will work.

Regards,
Markus

 
 
-Original message-
> From:Peter Sh 
> Sent: Monday 23rd July 2018 10:09
> To: solr-user@lucene.apache.org
> Subject: Can I use RegEx function?
> 
> I've got collection with a string or text field storing free-text. I'd like
> to use some RexEx function looking for patterns like "KEY:VALUE" from the
> text and use it for filtering and faceting.
> 


Can I use RegEx function?

2018-07-23 Thread Peter Sh
I've got collection with a string or text field storing free-text. I'd like
to use some RexEx function looking for patterns like "KEY:VALUE" from the
text and use it for filtering and faceting.


Re: Solr [subquery] document transformer

2018-07-23 Thread Mikhail Khludnev
Hello, Dwane.

[subquery] is not aware of authentication. You can create a JIRA ticket
for this feature.

On Mon, Jul 23, 2018 at 5:42 AM Dwane Hall  wrote:

> Good afternoon knowledgeable solr community.  I’m experiencing problems
> using a document transformer across a multiple shard collection and am
> wondering if anyone would please be able to assist or provide some guidance?
>
>
>
> The document transformer query below works nicely until I split the
> collection into multiple shards and then I receive what appears to be an
> authentication issue on the subquery.
>
>
>
>
>
> My query configuration (the original query returns a document with a field
> link to another ‘parent’ document)
>
> "parent.q":"{!edismax qf=CHILD_ID v=$row.PARENT_ID _route_=PARENT_ID!}",
>
> "parent.fl":"*",
>
> "parent.rows":1,
>
> "fl":"…other fields to display, parent:[subquery]",
>
>
>
> Environment:
>
> SolrCloud (7.3.1)
>
> Https
>
> Rules based authentication provider
>
>
>
> Any advice would be appreciated.
>
>
>
> DH
>
>
>
> 2018-07-23 11:43:06,445 5471250 ERROR : [c:my_collection s:shard1
> r:core_node3 x: my_collection_shard1_replica_n1]
> org.apache.solr.common.SolrException :
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
> from server at
> https://serverName:9021/solr/my_collection_shard3_replica_n4: Expected
> mime type application/octet-stream but got text/html:
>
> Error 401 require authentication
> HTTP ERROR 401
> Problem accessing /solr/my_collection_shard3_replica_n4/select. Reason:
> require authentication
>
>
>
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:607)
>
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
>
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
>
> at
> org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219)
>
> at
> org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:172)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>
> at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>
>at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>
> at java.lang.Thread.run(Thread.java:748)
>
>
>
> https://serverName:9021/solr/my_collection_shard1_replica_n1: parsing
> error
>
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:616)
>
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
>
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
>
> at
> org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219)
>
> at
> org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:172)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>
> at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>
> at java.lang.Thread.run(Thread.java:748)
>
> Caused by: org.apache.solr.common.SolrException: parsing error
>
> at
> org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:52)
>
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:614)
>
> ... 12 more
>
> Caused by: java.io.EOFException
>
> at
> org.apache.solr.common.util.FastInputStream.readByte(FastInputStream.java:207)
>
> at
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:255)
>
> at
> org.apache.solr.common.util.JavaBinCodec.readArray(JavaBinCodec.java:747)
>
> at
> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:272)
>
> at
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256)
>
> at
>