Re: Solr with HDFS configuration example running in production/dev

2020-10-29 Thread Gézapeti
> > at sun.nio.fs.WindowsFileSystem.getPath(WindowsFileSystem.java:255) > > at sun.nio.fs.AbstractPath.resolve(AbstractPath.java:53) > > at org.apache.solr.core.SolrCore.initUpdateLogDir(SolrCore.java:1380) > > at org.apache.solr.core.SolrCore.<init>(SolrCore.java:958) > >

Re: Solr with HDFS configuration example running in production/dev

2020-08-26 Thread Prashant Jyoti
>> > >> > Are you also aware of what feature we are moving towards instead of >> HDFS? >> > Will you be able to help me with the error that I'm running into? >> > >> > Thanks in advance! >> > >> > >> > On Wed, 19 A

Re: Solr with HDFS configuration example running in production/dev

2020-08-24 Thread Joe Obernberger
will disappear. >> >> On Wed, Aug 19, 2020 at 7:52 AM Prashant Jyoti <jtprash...@gmail.com> >> wrote: >> >>> Hi all, >>> Hope you are healthy and safe. >>> >>> Need some help with HDF

Re: Solr with HDFS configuration example running in production/dev

2020-08-20 Thread Aroop Ganguly
HDFS will still be there, just NOT on the core package, but as a plug-in or contrib. > On Aug 20, 2020, at 11:07 AM, Aroop Ganguly wrote: > > HDFS will still be there, just on the core package, but as a plug-in or > contrib.

Re: Solr with HDFS configuration example running in production/dev

2020-08-20 Thread Aroop Ganguly
Not sure you want to >> continue configuration if support will disappear. >> >> On Wed, Aug 19, 2020 at 7:52 AM Prashant Jyoti >> wrote: >> >>> Hi all, >>> Hope you are healthy and safe. >>> >>> Need some help with HDFS conf

Re: Solr with HDFS configuration example running in production/dev

2020-08-20 Thread Andrew MacKay
> > > at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:77) > > > at sun.nio.fs.WindowsPath.parse(WindowsPath.java:94) > > > at sun.nio.fs.WindowsFileSystem.getPath(WindowsFileSystem.java:255) > > > at sun.nio.fs.AbstractPath.resolve(AbstractPath.java

Re: Solr with HDFS configuration example running in production/dev

2020-08-20 Thread Mauro Asprea
Andrew. Even I read about that. But there's a use case for > > > which we want to configure the said case. > > > > > > Are you also aware of what feature we are moving towards instead of > HDFS? > > > Will you be able to help me with the error that I'm ru

Re: Solr with HDFS configuration example running in production/dev

2020-08-20 Thread Andy Hind
Hi I would not go down this road. What is the use case? Is this really the solution? Go read all the relevant docs and configuration provided by Cloudera/HortonWorks and everything else related to SOLR and HDFS. I am not inclined to help you down a road you do not want to travel

Re: Solr with HDFS configuration example running in production/dev

2020-08-20 Thread Prashant Jyoti
will disappear. > >> > >> On Wed, Aug 19, 2020 at 7:52 AM Prashant Jyoti > >> wrote: > >> > >>> Hi all, > >>> Hope you are healthy and safe. > >>> > >>> Need some help with HDFS configuration. > >>> > >

Re: Solr with HDFS configuration example running in production/dev

2020-08-19 Thread Joe Obernberger
AM Prashant Jyoti wrote: Hi all, Hope you are healthy and safe. Need some help with HDFS configuration. Could anybody of you share an example of the configuration with which you are running Solr with HDFS in any of your production/dev environments? I am interested in the parts of SolrConfig.xml

Re: Solr with HDFS configuration example running in production/dev

2020-08-19 Thread Prashant Jyoti
d safe. > > > > Need some help with HDFS configuration. > > > > Could anybody of you share an example of the configuration with which you > > are running Solr with HDFS in any of your production/dev environments? > > I am interested in the parts of SolrConf

Re: Solr with HDFS configuration example running in production/dev

2020-08-19 Thread Andrew MacKay
ybody of you share an example of the configuration with which you > are running Solr with HDFS in any of your production/dev environments? > I am interested in the parts of SolrConfig.xml / Solr.in.cmd/sh which you > may have modified. Obviously with the security parts obfuscated.

Solr with HDFS configuration example running in production/dev

2020-08-19 Thread Prashant Jyoti
Hi all, Hope you are healthy and safe. Need some help with HDFS configuration. Could anybody of you share an example of the configuration with which you are running Solr with HDFS in any of your production/dev environments? I am interested in the parts of SolrConfig.xml / Solr.in.cmd/sh which

Solr with HDFS

2020-08-17 Thread Prashant Jyoti
Hi, I am trying to get Solr running with HDFS but getting the attached exception in logs when trying to create a collection. I have attached the relevant portions of solrconfig.xml and solr.in.cmd that I have modified. Could anybody point me in the right direction? What might I be doing wrong? Any

Re: Solr on HDFS

2019-08-02 Thread Kevin Risden
many machines. HDFS > makes this easy. > > -Joe > > On 8/2/2019 9:10 AM, lstusr 5u93n4 wrote: > > Hi Joe, > > > > We fought with Solr on HDFS for quite some time, and faced similar issues > > as you're seeing. (See this thread, for example: > > >

Re: Solr on HDFS

2019-08-02 Thread Joe Obernberger
one large file system to manage instead of lots of individual file systems across many machines.  HDFS makes this easy. -Joe On 8/2/2019 9:10 AM, lstusr 5u93n4 wrote: Hi Joe, We fought with Solr on HDFS for quite some time, and faced similar issues as you're seeing. (See this thread

Re: Solr on HDFS

2019-08-02 Thread lstusr 5u93n4
Hi Joe, We fought with Solr on HDFS for quite some time, and faced similar issues as you're seeing. (See this thread, for example: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201812.mbox/%3cCABd9LjTeacXpy3FFjFBkzMq6vhgu7Ptyh96+w-KC2p=-rqk...@mail.gmail.com%3e ) The Solr

Re: Solr on HDFS

2019-08-02 Thread Joe Obernberger
, recreate and index the affected collection, while you work your other issues. On Aug 1, 2019, at 16:40, Joe Obernberger wrote: Been using Solr on HDFS for a while now, and I'm seeing an issue with redundancy/reliability. If a server goes down, when it comes back up, it will never recover

Re: Solr on HDFS

2019-08-01 Thread Angie Rabelero
collection, while you work your other issues. On Aug 1, 2019, at 16:40, Joe Obernberger wrote: Been using Solr on HDFS for a while now, and I'm seeing an issue with redundancy/reliability. If a server goes down, when it comes back up, it will never recover because of the lock files in HDFS. That solr

Solr on HDFS

2019-08-01 Thread Joe Obernberger
Been using Solr on HDFS for a while now, and I'm seeing an issue with redundancy/reliability.  If a server goes down, when it comes back up, it will never recover because of the lock files in HDFS. That solr node needs to be brought down manually, the lock files deleted, and then brought back

Re: solr cloud - hdfs folder structure best practice

2018-11-02 Thread lstusr 5u93n4
e are issues with autoAddReplicas or other types of failovers > if there are different home folders. > > I've run Solr on HDFS with the same basic configs as listed here: > > https://risdenk.github.io/2018/10/23/apache-solr-running-on-apache-hadoop-hdfs.html > > Kevin Risden > > > On

Re: solr cloud - hdfs folder structure best practice

2018-11-02 Thread Kevin Risden
if there are different home folders. I've run Solr on HDFS with the same basic configs as listed here: https://risdenk.github.io/2018/10/23/apache-solr-running-on-apache-hadoop-hdfs.html Kevin Risden On Fri, Nov 2, 2018 at 1:19 PM lstusr 5u93n4 wrote: > Hi All, > > Here's a question tha

solr cloud - hdfs folder structure best practice

2018-11-02 Thread lstusr 5u93n4
Hi All, Here's a question that I can't find an answer to in the documentation: When configuring solr cloud with HDFS, is it best to: a) provide a unique hdfs folder for each solr cloud instance or b) provide the same hdfs folder to all solr cloud instances. So for example, if I have two

Re: An exception when running Solr on HDFS,why a solr server can not recognize the write.lock file is created by itself before?

2018-08-27 Thread zhenyuan wei
@Shawn Heisey Yeah, deleting the "write.lock" files manually worked in the end. @Walter Underwood Has there been any performance evaluation of Solr on HDFS vs LocalFS recently? On Tue, Aug 28, 2018 at 4:10 AM, Shawn Heisey wrote: > On 8/26/2018 7:47 PM, zhenyuan wei wrote: > > I found an exceptio

Re: An exception when running Solr on HDFS,why a solr server can not recognize the write.lock file is created by itself before?

2018-08-27 Thread Shawn Heisey
On 8/26/2018 7:47 PM, zhenyuan wei wrote: I found an exception when running Solr on HDFS。The detail is: Running solr on HDFS,and update doc was running always, then,kill -9 solr JVM or reboot linux os/shutdown linux os,then restart all. If you use "kill -9" to stop a Sol

Re: An exception when running Solr on HDFS,why a solr server can not recognize the write.lock file is created by itself before?

2018-08-27 Thread Walter Underwood
PM zhenyuan wei wrote: >>> >>> Hi all, >>>I found an exception when running Solr on HDFS。The detail is: >>> Running solr on HDFS,and update doc was running always, >>> then,kill -9 solr JVM or reboot linux os/shutdown linux os,then restart >> a

Re: An exception when running Solr on HDFS,why a solr server can not recognize the write.lock file is created by itself before?

2018-08-27 Thread zhenyuan wei
l, > > I found an exception when running Solr on HDFS。The detail is: > > Running solr on HDFS,and update doc was running always, > > then,kill -9 solr JVM or reboot linux os/shutdown linux os,then restart > all. > > The exception appears like: > > > >

Re: An exception when running Solr on HDFS,why a solr server can not recognize the write.lock file is created by itself before?

2018-08-26 Thread Erick Erickson
Because HDFS doesn't follow the file semantics that Solr expects. There's quite a bit of background here: https://issues.apache.org/jira/browse/SOLR-8335 Best, Erick On Sun, Aug 26, 2018 at 6:47 PM zhenyuan wei wrote: > > Hi all, > I found an exception when running Solr on HDFS。T

An exception when running Solr on HDFS,why a solr server can not recognize the write.lock file is created by itself before?

2018-08-26 Thread zhenyuan wei
Hi all, I found an exception when running Solr on HDFS。The detail is: Running solr on HDFS,and update doc was running always, then,kill -9 solr JVM or reboot linux os/shutdown linux os,then restart all. The exception appears like: 2018-08-26 22:23:12.529 ERROR (coreContainerWorkExecutor-2

Re: Solr 7 + HDFS issue

2018-06-13 Thread Shawn Heisey
On 6/12/2018 10:14 PM, Joe Obernberger wrote: Thank you Shawn.  It looks like it is being applied.  This could be some sort of chain reaction where: Drive or server fails.  HDFS starts to replicate blocks which causes network congestion.  Solr7 can't talk, so initiates a replication process

Re: Solr 7 + HDFS issue

2018-06-12 Thread Joe Obernberger
Thank you Shawn.  It looks like it is being applied.  This could be some sort of chain reaction where: Drive or server fails.  HDFS starts to replicate blocks which causes network congestion.  Solr7 can't talk, so initiates a replication process which causes more network congestion, which

Re: Solr 7 + HDFS issue

2018-06-12 Thread Shawn Heisey
On 6/11/2018 9:46 AM, Joe Obernberger wrote: > We are seeing an issue on our Solr Cloud 7.3.1 cluster where > replication starts and pegs network interfaces so aggressively that > other tasks cannot talk.  We will see it peg a bonded 2GB interfaces.  > In some cases the replication fails over and

Solr 7 + HDFS issue

2018-06-11 Thread Joe Obernberger
We are seeing an issue on our Solr Cloud 7.3.1 cluster where replication starts and pegs network interfaces so aggressively that other tasks cannot talk.  We will see it peg a bonded 2GB interfaces.  In some cases the replication fails over and over until it finally succeeds and the replica

Re: Running Solr on HDFS - Disk space

2018-06-07 Thread Hendrik Haddorp
The only option should be to configure Solr to just have a replication factor of 1 or HDFS to have no replication. I would go for the middle and configure both to use a factor of 2. This way a single failure in HDFS and Solr is not a problem. While in 1/3 or 3/1 option a single server error

Re: Running Solr on HDFS - Disk space

2018-06-07 Thread Shawn Heisey
to configure HDFS or Solr such that only three copies are maintained overall? Yes, that is exactly what happens. SolrCloud replication assumes that each of its replicas is a completely independent index.  I am not aware of anything in Solr's HDFS support that can use one HDFS index directory
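The multiplication discussed in this thread can be written out as a small sketch. It assumes, as the replies above describe, that each SolrCloud replica is a fully independent index and that HDFS replicates every block of each one; the function name `total_copies` is made up for illustration and is not part of Solr or Hadoop.

```python
def total_copies(solr_replication_factor: int, hdfs_replication_factor: int) -> int:
    """Physical copies of each document when SolrCloud replicas live on HDFS.

    Hypothetical helper: every Solr replica is an independent index, and HDFS
    then replicates the blocks of each of those indexes on its own.
    """
    if solr_replication_factor < 1 or hdfs_replication_factor < 1:
        raise ValueError("replication factors must be at least 1")
    return solr_replication_factor * hdfs_replication_factor


if __name__ == "__main__":
    # (Solr RF, HDFS RF) combinations mentioned in the thread.
    for solr_rf, hdfs_rf in [(3, 3), (2, 2), (1, 3), (3, 1)]:
        copies = total_copies(solr_rf, hdfs_rf)
        print(f"Solr RF={solr_rf}, HDFS RF={hdfs_rf}: {copies} copies")
```

With both factors at 3 the product is 9, which is the figure asked about in the original question; the 2/2 setting suggested in the thread gives 4.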

Running Solr on HDFS - Disk space

2018-06-07 Thread Greenhorn Techie
Hi, As HDFS has got its own replication mechanism, with a HDFS replication factor of 3, and then SolrCloud replication factor of 3, does that mean each document will probably have around 9 copies replicated underneath of HDFS? If so, is there a way to configure HDFS or Solr such that only three

Re: Solr on HDFS vs local storage - Benchmarking

2017-11-22 Thread Erick Erickson
bq: We also had an HDFS setup already so it looked like a good option to not lose data. Earlier we had a few cases where we lost the machines so HDFS looked safer for that. right, that's one of the places where using HDFS to back Solr makes a lot of sense. The other approach is to just have

Re: Solr on HDFS vs local storage - Benchmarking

2017-11-22 Thread Hendrik Haddorp
cted to us using Docker and Marathon/Mesos. With HDFS the data is in a shared file system and thus it is possible to move the replica to a different instance on a different host. regards, Hendrik On 22.11.2017 14:59, Greenhorn Techie wrote: Hi, Good Afternoon!! While the discussion around i

Re: Solr on HDFS vs local storage - Benchmarking

2017-11-22 Thread Erick Erickson
> regards, >> Hendrik >> >> On 22.11.2017 14:59, Greenhorn Techie wrote: >> > Hi, >> > >> > Good Afternoon!! >> > >> > While the discussion around issues related to "Solr on HDFS" is live, I >> > would like to understan

Re: Solr on HDFS vs local storage - Benchmarking

2017-11-22 Thread Greenhorn Techie
> thus it is possible to move the replica to a different instance on a > different host. > > regards, > Hendrik > > On 22.11.2017 14:59, Greenhorn Techie wrote: > > Hi, > > > > Good Afternoon!! > > > > While the discussion around issues related t

Re: Solr on HDFS vs local storage - Benchmarking

2017-11-22 Thread Hendrik Haddorp
. regards, Hendrik On 22.11.2017 14:59, Greenhorn Techie wrote: Hi, Good Afternoon!! While the discussion around issues related to "Solr on HDFS" is live, I would like to understand if anyone has done any performance benchmarking for both Solr indexing and search between HDFS vs local f

Solr on HDFS vs local storage - Benchmarking

2017-11-22 Thread Greenhorn Techie
Hi, Good Afternoon!! While the discussion around issues related to "Solr on HDFS" is live, I would like to understand if anyone has done any performance benchmarking for both Solr indexing and search between HDFS vs local file system. Also, from experience, what would the commu

Re: Solr with HDFS on AWS S3 - Server restart fails to load the core

2017-04-07 Thread Amarnath palavalli
ossibly > using > > > the Core Admin Reload would do it (https://cwiki.apache.org/ > > > confluence/display/solr/CoreAdmin+API#CoreAdminAPI-RELOAD). > > > > > > Best of luck, > > > > > > Trey > > > > > > From: Amarnath palavalli [

Re: Solr with HDFS on AWS S3 - Server restart fails to load the core

2017-04-07 Thread Kevin Risden
> > the Core Admin Reload would do it (https://cwiki.apache.org/ > > confluence/display/solr/CoreAdmin+API#CoreAdminAPI-RELOAD). > > > > Best of luck, > > > > Trey > > > > From: Amarnath palavalli [mailto:pamarn...@gmail.com] > > Sent: Frida

Re: Solr with HDFS on AWS S3 - Server restart fails to load the core

2017-04-07 Thread Amarnath palavalli
org/ > confluence/display/solr/CoreAdmin+API#CoreAdminAPI-RELOAD). > > Best of luck, > > Trey > > From: Amarnath palavalli [mailto:pamarn...@gmail.com] > Sent: Friday, April 07, 2017 3:20 PM > To: solr-user@lucene.apache.org > Subject: Solr with HDFS on AWS S3 - Server res

RE: Solr with HDFS on AWS S3 - Server restart fails to load the core

2017-04-07 Thread Cahill, Trey
://cwiki.apache.org/confluence/display/solr/CoreAdmin+API#CoreAdminAPI-RELOAD). Best of luck, Trey From: Amarnath palavalli [mailto:pamarn...@gmail.com] Sent: Friday, April 07, 2017 3:20 PM To: solr-user@lucene.apache.org Subject: Solr with HDFS on AWS S3 - Server restart fails to load the core

Solr with HDFS on AWS S3 - Server restart fails to load the core

2017-04-07 Thread Amarnath palavalli
Hello, I configured Solr to use HDFS, which in turn is configured to use S3N. I used the information from this issue to configure: https://issues.apache.org/jira/browse/SOLR-9952 Here is the command I have used to start the Solr with HDFS:

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-02-22 Thread Hendrik Haddorp
l is missing to set the shard id I guess or some code is checking wrongly. I know very little about how SolrCloud interacts with HDFS, so although I'm reasonably certain about what comes below, I could be wrong. I have not ever heard of SolrCloud being able to automatically take over an exi

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-02-22 Thread Erick Erickson
unfortunately can easily >>>> lead >>>> to SOLR-8335, which hopefully will be fixed by SOLR-8169. A manual >>>> cleanup >>>> is however also easily done but seems to require a node restart to take >>>> effect. But I'm also only recentl

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-02-21 Thread Hendrik Haddorp
nteracts with HDFS, so although I'm reasonably certain about what comes below, I could be wrong. I have not ever heard of SolrCloud being able to automatically take over an existing index directory when it creates a replica, or even share index directories unless the admin fools it into doing so witho

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-02-21 Thread Erick Erickson
> Given that the data is on HDFS it shouldn't matter if any active >>>> replica is left as the data does not need to get transferred from >>>> another instance but the new core will just take over the existing >>>> data. Thus a replication factor of 1 should also wo

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-02-21 Thread Hendrik Haddorp
d of SolrCloud being able to automatically take over an existing index directory when it creates a replica, or even share index directories unless the admin fools it into doing so without its knowledge. Sharing an index directory for replicas with SolrCloud would NOT work correctly. Solr must be able to

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-19 Thread Hendrik Haddorp
ly certain about what comes below, I could be wrong. I have not ever heard of SolrCloud being able to automatically take over an existing index directory when it creates a replica, or even share index directories unless the admin fools it into doing so without its knowledge. Sharing an index direct

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-19 Thread Shawn Heisey
irectory for replicas with SolrCloud would NOT work correctly. Solr must be able to update all replicas independently, which means that each of them will lock its index directory and write to it. It is my understanding (from reading messages on mailing lists) that when using HDFS, Solr replicas are all

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-19 Thread Hendrik Haddorp
Hi, I'm seeing the same issue on Solr 6.3 using HDFS and a replication factor of 3, even though I believe a replication factor of 1 should work the same. When I stop a Solr instance this is detected and Solr actually wants to create a replica on a different instance. The command for that does

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-13 Thread Shawn Heisey
On 1/13/2017 5:46 PM, Chetas Joshi wrote: > One of the things I have observed is: if I use the collection API to > create a replica for that shard, it does not complain about the config > which has been set to ReplicationFactor=1. If replication factor was > the issue as suggested by Shawn,

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-13 Thread Chetas Joshi
Erick > > On Thu, Jan 12, 2017 at 8:42 AM, Shawn Heisey <apa...@elyograg.org> wrote: > > On 1/11/2017 7:14 PM, Chetas Joshi wrote: > >> This is what I understand about how Solr works on HDFS. Please correct > me > >> if I am wrong. > >> > >>

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-12 Thread Erick Erickson
tas Joshi wrote: >> This is what I understand about how Solr works on HDFS. Please correct me >> if I am wrong. >> >> Although solr shard replication Factor = 1, HDFS default replication = 3. >> When the node goes down, the solr server running on that node goes down and >

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-12 Thread Shawn Heisey
On 1/11/2017 7:14 PM, Chetas Joshi wrote: > This is what I understand about how Solr works on HDFS. Please correct me > if I am wrong. > > Although solr shard replication Factor = 1, HDFS default replication = 3. > When the node goes down, the solr server running on that node goes d

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-11 Thread Shawn Heisey
On 1/11/2017 1:47 PM, Chetas Joshi wrote: > I have deployed a SolrCloud (solr 5.5.0) on hdfs using cloudera 5.4.7. The > cloud has 86 nodes. > > This is my config for the collection > > numShards=80 > ReplicationFactor=1 > maxShardsPerNode=1 > autoAddReplica=true >

Solr on HDFS: AutoAddReplica does not add a replica

2017-01-11 Thread Chetas Joshi
Hello, I have deployed a SolrCloud (solr 5.5.0) on hdfs using cloudera 5.4.7. The cloud has 86 nodes. This is my config for the collection numShards=80 ReplicationFactor=1 maxShardsPerNode=1 autoAddReplica=true I recently decommissioned a node to resolve some disk issues. The shard

Re: Solr on HDFS: Streaming API performance tuning

2016-12-19 Thread Joel Bernstein
I took another look at the stack trace and I'm pretty sure the issue is with NULL values in one of the sort fields. The null pointer is occurring during the comparison of sort values. See line 85 of:

Re: Solr on HDFS: Streaming API performance tuning

2016-12-19 Thread Chetas Joshi
Hi Joel, I don't have any solr documents that have NULL values for the sort fields I use in my queries. Thanks! On Sun, Dec 18, 2016 at 12:56 PM, Joel Bernstein wrote: > Ok, based on the stack trace I suspect one of your sort fields has NULL > values, which in the 5x

Re: Solr on HDFS: Streaming API performance tuning

2016-12-18 Thread Joel Bernstein
Ok, based on the stack trace I suspect one of your sort fields has NULL values, which in the 5x branch could produce null pointers if a segment had no values for a sort field. This is also fixed in the Solr 6x branch. Joel Bernstein http://joelsolr.blogspot.com/ On Sat, Dec 17, 2016 at 2:44 PM,

Re: Solr on HDFS: Streaming API performance tuning

2016-12-17 Thread Chetas Joshi
Here is the stack trace. java.lang.NullPointerException at org.apache.solr.client.solrj.io.comp.FieldComparator$2.compare(FieldComparator.java:85) at org.apache.solr.client.solrj.io.comp.FieldComparator.compare(FieldComparator.java:92) at

Re: Solr on HDFS: Streaming API performance tuning

2016-12-16 Thread Reth RM
If you could provide the json parse exception stack trace, it might help to predict issue there. On Fri, Dec 16, 2016 at 5:52 PM, Chetas Joshi wrote: > Hi Joel, > > The only NON alpha-numeric characters I have in my data are '+' and '/'. I > don't have any backslashes.

Re: Solr on HDFS: Streaming API performance tuning

2016-12-16 Thread Chetas Joshi
Hi Joel, The only NON alpha-numeric characters I have in my data are '+' and '/'. I don't have any backslashes. If the special characters was the issue, I should get the JSON parsing exceptions every time irrespective of the index size and irrespective of the available memory on the machine.

Re: Solr on HDFS: Streaming API performance tuning

2016-12-16 Thread Joel Bernstein
The Streaming API may have been throwing exceptions because the JSON special characters were not escaped. This was fixed in Solr 6.0. Joel Bernstein http://joelsolr.blogspot.com/ On Fri, Dec 16, 2016 at 4:34 PM, Chetas Joshi wrote: > Hello, > > I am running Solr

Solr on HDFS: Streaming API performance tuning

2016-12-16 Thread Chetas Joshi
Hello, I am running Solr 5.5.0. It is a solrCloud of 50 nodes and I have the following config for all the collections. maxShardsperNode: 1 replicationFactor: 1 I was using Streaming API to get back results from Solr. It worked fine for a while until the index data size reached beyond 40 GB per

Re: Solr on HDFS: increase in query time with increase in data

2016-12-16 Thread Shawn Heisey
On 12/16/2016 11:58 AM, Chetas Joshi wrote: > How different the index data caching mechanism is for the Streaming > API from the cursor approach? Solr and Lucene do not handle that caching. Systems external to Solr (like the OS, or HDFS) handle the caching. The cache effectiveness will be a

Re: Solr on HDFS: increase in query time with increase in data

2016-12-16 Thread Chetas Joshi
? Thanks! On Fri, Dec 16, 2016 at 6:52 AM, Shawn Heisey <apa...@elyograg.org> wrote: > On 12/14/2016 11:58 AM, Chetas Joshi wrote: > > I am running Solr 5.5.0 on HDFS. It is a solrCloud of 50 nodes and I have > > the following config. > > maxShardsperNode: 1 > > re

Re: Solr on HDFS: increase in query time with increase in data

2016-12-16 Thread Piyush Kunal
be split. > > On Wed, Dec 14, 2016 at 10:58 AM, Chetas Joshi <chetas.jo...@gmail.com> > wrote: > > > Hi everyone, > > > > I am running Solr 5.5.0 on HDFS. It is a solrCloud of 50 nodes and I have > > the following config. > > maxShardsperNode: 1 >

Re: Solr on HDFS: increase in query time with increase in data

2016-12-15 Thread Reth RM
I think the shard index size is huge and should be split. On Wed, Dec 14, 2016 at 10:58 AM, Chetas Joshi <chetas.jo...@gmail.com> wrote: > Hi everyone, > > I am running Solr 5.5.0 on HDFS. It is a solrCloud of 50 nodes and I have > the following config. > maxShardsperNode:

Solr on HDFS: increase in query time with increase in data

2016-12-14 Thread Chetas Joshi
Hi everyone, I am running Solr 5.5.0 on HDFS. It is a solrCloud of 50 nodes and I have the following config. maxShardsperNode: 1 replicationFactor: 1 I have been ingesting data into Solr for the last 3 months. With increase in data, I am observing increase in the query time. Currently the size

Re: Solr on HDFS: adding a shard replica

2016-09-14 Thread Erick Erickson
<chetas.jo...@gmail.com> > wrote: > >> Hi, >> >> I just started experimenting with solr cloud. >> >> I have a solr cloud of 20 nodes. I have one collection with 18 shards >> running on 18 different nodes with replication factor=1. >> >> W

Re: Solr on HDFS: adding a shard replica

2016-09-13 Thread Chetas Joshi
> > I just started experimenting with solr cloud. > > I have a solr cloud of 20 nodes. I have one collection with 18 shards > running on 18 different nodes with replication factor=1. > > When one of my shards goes down, I create a replica using the Solr UI. On > HDFS I see a

Solr on HDFS: adding a shard replica

2016-09-13 Thread Chetas Joshi
Hi, I just started experimenting with solr cloud. I have a solr cloud of 20 nodes. I have one collection with 18 shards running on 18 different nodes with replication factor=1. When one of my shards goes down, I create a replica using the Solr UI. On HDFS I see a core getting added

Re: Solr and HDFS configuration

2015-03-24 Thread Michael Della Bitta
The ultimate answer is that you need to test your configuration with your expected workflow. However, the thing that mitigates the remote IO factor (hopefully) is that the Solr HDFS stuff features a blockcache that should (when tuned correctly) cache in RAM the blocks your Solr process needs

Solr and HDFS configuration

2015-03-24 Thread Joseph Obernberger
Hi All - does it make sense to run a solr shard on a node within an Hadoop cluster that is not a data node? In that case all the data that node processes would need to come over the network, but you get the benefit of more CPU for things like faceting. Thank you! -Joe

Re: Solr on HDFS in a Hadoop cluster

2015-01-08 Thread Charles VALLEE
Nanterre charles.val...@edf.fr Tel.: + (0) 1 78 66 69 81 A simple gesture for the environment: print this message only if you need to. From: otis.gospodne...@gmail.com To: solr-user@lucene.apache.org Date: 06/01/2015 18:55 Subject: Re: Solr on HDFS in a Hadoop cluster

Re: Solr on HDFS in a Hadoop cluster

2015-01-06 Thread Otis Gospodnetic
Hi Charles, See http://search-lucene.com/?q=solr+hdfs and https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On Tue, Jan 6, 2015 at 11:02 AM

Solr on HDFS in a Hadoop cluster

2015-01-06 Thread Charles VALLEE
I am considering using Solr to extend Hortonworks Data Platform capabilities to search. - I found tutorials to index documents into a Solr instance from HDFS, but I guess this solution would require a Solr cluster distinct to the Hadoop cluster. Is it possible to have a Solr integrated

Re: Solr on HDFS in a Hadoop cluster

2015-01-06 Thread Otis Gospodnetic
://search-lucene.com/?q=solr+hdfs and https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On Tue, Jan 6, 2015 at 11:02 AM, Charles VALLEE

Re: Solr and HDFS

2014-08-29 Thread nagyMarcelo
Details : local host is: scixd0021cld.itau/10.58.10.147; destination host is: scixd0021cld.itau:8022; -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-and-HDFS-tp4155470p4155872.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr and HDFS

2014-08-29 Thread Michael Della Bitta
; destination host is: scixd0021cld.itau:8022; -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-and-HDFS-tp4155470p4155872.html Sent from the Solr - User mailing list archive at Nabble.com.

Solr and HDFS

2014-08-27 Thread Leo Oliveira
Hello everyone! Simple question: does Solr 4.8 work with the HDFS of Hadoop in CDH 5? The documentation page says that Solr would work with Hadoop 2.0.x, but it doesn't mention newer Hadoop versions. There's a comment from a user asking about this problem, but it's not clear to me whether there's an

Re: solr over hdfs for accessing/ changing indexes outside solr

2014-08-07 Thread Ali Nazemian
remembered some times ago, somebody asked about what is the point of modify Solr to use HDFS for storing indexes. As far as I remember somebody told him integrating Solr with HDFS has two advantages. 1) having hadoop replication and HA. 2) using indexes and Solr documents for other purposes

Re: solr over hdfs for accessing/ changing indexes outside solr

2014-08-07 Thread Erick Erickson
difficult/impossible with Solr previously. Best, Erick On Tue, Aug 5, 2014 at 9:37 PM, Ali Nazemian alinazem...@gmail.com wrote: Dear Erick, I remembered some times ago, somebody asked about what is the point of modify Solr to use HDFS for storing indexes. As far as I

Re: solr over hdfs for accessing/ changing indexes outside solr

2014-08-07 Thread Ali Nazemian
Solr to use HDFS for storing indexes. As far as I remember somebody told him integrating Solr with HDFS has two advantages. 1) having hadoop replication and HA. 2) using indexes and Solr documents for other purposes such as Analysis. So why we go for HDFS in the case of analysis

Re: solr over hdfs for accessing/ changing indexes outside solr

2014-08-06 Thread Erick Erickson
opens up possibilities for working on data that were difficult/impossible with Solr previously. Best, Erick On Tue, Aug 5, 2014 at 9:37 PM, Ali Nazemian alinazem...@gmail.com wrote: Dear Erick, I remember that some time ago somebody asked what the point is of modifying Solr to use HDFS

solr over hdfs for accessing/ changing indexes outside solr

2014-08-05 Thread Ali Nazemian
Dear all, I changed Solr 4.9 to write its index and data to HDFS. Now I want to connect to that data from outside Solr to change some of the values. Could somebody please tell me how that is possible? Suppose I am using HBase over HDFS to make these changes. Best regards. --

Re: solr over hdfs for accessing/ changing indexes outside solr

2014-08-05 Thread Shawn Heisey
On 8/5/2014 7:04 AM, Ali Nazemian wrote: I changed Solr 4.9 to write its index and data to HDFS. Now I want to connect to that data from outside Solr to change some of the values. Could somebody please tell me how that is possible? Suppose I am using HBase over HDFS to make these

Re: solr over hdfs for accessing/ changing indexes outside solr

2014-08-05 Thread Michael Della Bitta
Probably the most correct way to modify the index would be to use the Solr REST API to push your changes out. Another thing you might want to look at is Lily. Basically it's a way to set up a Solr collection as an HBase replication target, so changes to your HBase table would automatically

Re: solr over hdfs for accessing/ changing indexes outside solr

2014-08-05 Thread Ali Nazemian
Actually, I am going to do some analysis on the Solr data using MapReduce. For this purpose I might need to change some of the data or add new fields from outside Solr. On Tue, Aug 5, 2014 at 5:51 PM, Shawn Heisey s...@elyograg.org wrote: On 8/5/2014 7:04 AM, Ali Nazemian wrote: I

Re: solr over hdfs for accessing/ changing indexes outside solr

2014-08-05 Thread Erick Erickson
What you haven't told us is what you mean by "modify the index outside Solr". SolrJ? Using raw Lucene? Trying to modify things by writing your own codec? Standard Java I/O operations? Other? You could use SolrJ to connect to an existing Solr server and both read and modify at will from your M/R

Re: solr over hdfs for accessing/ changing indexes outside solr

2014-08-05 Thread Ali Nazemian
Dear Erick, Thank you for your reply. Yes, I am aware that SolrJ is my last option. I was thinking about raw I/O operations, but according to your reply that is probably not applicable. What about the Lily project that Michael mentioned? Is that considered SolrJ too? Are you aware of Cloudera

Re: solr over hdfs for accessing/ changing indexes outside solr

2014-08-05 Thread Ali Nazemian
Dear Erick, I remember that some time ago somebody asked what the point is of modifying Solr to use HDFS for storing indexes. As far as I remember, somebody told him that integrating Solr with HDFS has two advantages: 1) having Hadoop replication and HA; 2) using indexes and Solr documents for other

Re: SOLR on hdfs

2014-07-08 Thread shlash
Hi all, I am new to Solr and HDFS. I am trying to index text content extracted from binary files (PDF, MS Office, etc.) that are stored on HDFS (single node). So far I have Solr running on HDFS and have created the core, but I couldn't send the files to Solr for indexing. Can someone

Re: solr cloud + hdfs issue

2014-01-21 Thread longsan
Thanks. I think it's a good option for me. -- View this message in context: http://lucene.472066.n3.nabble.com/solr-cloud-hdfs-issue-tp4111593p4112422.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: solr cloud + hdfs issue

2014-01-21 Thread Greg Walters
failure handle this gracefully and move to a DOWN state? On Jan 21, 2014, at 4:29 AM, longsan longsan...@sina.com wrote: thanks. i think it's a good option for me. -- View this message in context: http://lucene.472066.n3.nabble.com/solr-cloud-hdfs-issue-tp4111593p4112422.html Sent
