Re: Commit Issue in Solr 3.4

2014-02-08 Thread samarth s
Yes, it is Amazon EC2 indeed.

To expand on that:
This Solr deployment was working fine, handling the same load, on a 34 GB
instance with EBS storage for quite some time. To reduce the time taken by a
commit, I shifted it to a 30 GB SSD instance. It certainly performed better on
writes and commits. But since last week I have started facing this problem of
endless back-to-back commits. Not being able to resolve this, I have finally
switched back to a 34 GB machine with EBS storage, and now the commits are
working fine, though slowly.

Any thoughts?
On 6 Feb 2014 23:00, Shawn Heisey s...@elyograg.org wrote:

 On 2/6/2014 9:56 AM, samarth s wrote:
  Size of index = 260 GB
  Total Docs = 100mn
  Usual writing speed = 50K per hour
  autoCommit-maxDocs = 400,000
  autoCommit-maxTime = 1,500,000 ms (25 mins)
  merge factor = 10
 
  M/c memory = 30 GB, Xmx = 20 GB
  Server - Jetty
  OS - CentOS 6

 With 30GB of RAM (is it Amazon EC2, by chance?) and a 20GB heap, you
 have about 10GB of RAM left for caching your Solr index.  If that server
 has all 260GB of index, I am really surprised that you have only been
 having problems for a short time.  I would have expected problems from
 day one.  Even if it only has half or one quarter of the index, there is
 still a major discrepancy in RAM vs. index size.

 You either need more memory or you need to reduce the size of your
 index.  The size of the indexed portion generally has more of an impact
 on performance than the size of the stored portion, but they do both
 have an impact, especially on indexing and committing.  With regular
 disks, it's best to have at least 50% of your index size available to
 the OS disk cache, but 100% is better.

 http://wiki.apache.org/solr/SolrPerformanceProblems#OS_Disk_Cache

 If you are already using SSD, you might think there can't be
 memory-related performance problems ... but you still need a pretty
 significant chunk of disk cache.

 https://wiki.apache.org/solr/SolrPerformanceProblems#SSD

 Thanks,
 Shawn




Commit Issue in Solr 3.4

2014-02-06 Thread samarth s
Hi,

I have been using Solr 3.4 in a project for more than a year. It is only now
that I have started facing a weird problem of never-ending back-to-back commit
cycles. I can see from the InfoStream logs that, as soon as one commit cycle
is done, another one spawns almost immediately. My writer processes, which use
SolrJ as the client, do not get a chance to write even a single document
between these commits. I have waited for hours to let these commits run their
course and finish naturally, but they don't. Finally, I had to restart the
Solr server. After that, my writers could get away with writing a few thousand
docs, after which the same infinite commit cycles started again. I could not
find any related JIRA issue on this.

Size of index = 260 GB
Total Docs = 100mn
Usual writing speed = 50K per hour
autoCommit-maxDocs = 400,000
autoCommit-maxTime = 1,500,000 ms (25 mins)
merge factor = 10

M/c memory = 30 GB, Xmx = 20 GB
Server - Jetty
OS - CentOS 6
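
For reference, these autoCommit values correspond to a solrconfig.xml block
roughly like this (a sketch in the stock Solr 3.x form; only the two values
above are from my setup):

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>400000</maxDocs>
    <maxTime>1500000</maxTime> <!-- 25 minutes, in ms -->
  </autoCommit>
</updateHandler>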


Please let me know if any other details are needed on the setup. Any help
is highly appreciated. Thanks.

-- 
Regards,
Samarth


Sum as a Projection for Facet Queries

2013-07-01 Thread samarth s
Hi,

We need to find the sum of a field for each facet.query. We have looked at
StatsComponent (http://wiki.apache.org/solr/StatsComponent), but that supports
only facet.field. Has anyone written a patch over StatsComponent that supports
the same, along with some performance measurements?

Is there any way we can do this using the function query sum
(http://wiki.apache.org/solr/FunctionQuery#sum)?
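
For concreteness, the field-based form that StatsComponent does support looks
like this (field names hypothetical):

stats=true&stats.field=amount&stats.facet=category

What we are after is the analogous sum per facet.query rather than per value
of a facet.field.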

-- 
Regards,
Samarth


Re: updateLog in Solr 4.2

2013-04-14 Thread samarth s
I have a similar problem. The reason is that my application performs
back-to-back updates, and, as came out of my performance tests, an update
immediately following another seems to be a lot slower with the update log
enabled than without it.

Is this a genuine effect, or did I miss something in my performance tests?
Any pointers on this one are highly appreciated.
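
For reference, the block I am toggling in solrconfig.xml is the stock Solr
4.x transaction log configuration:

<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
</updateLog>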



On Fri, Apr 12, 2013 at 6:47 PM, vicky desai vicky.de...@germinait.com wrote:

 If i disable update log in solr 4.2 then i get the following exception
 SEVERE: :java.lang.NullPointerException
 at

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:190)
 at

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:156)
 at

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:100)
 at
 org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:266)
 at
 org.apache.solr.cloud.ZkController.joinElection(ZkController.java:935)
 at
 org.apache.solr.cloud.ZkController.register(ZkController.java:761)
 at
 org.apache.solr.cloud.ZkController.register(ZkController.java:727)
 at
 org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:908)
 at
 org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:892)
 at
 org.apache.solr.core.CoreContainer.register(CoreContainer.java:841)
 at
 org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:638)
 at
 org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
 at
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at

 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at

 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)

 Apr 12, 2013 6:39:56 PM org.apache.solr.common.SolrException log
 SEVERE: null:org.apache.solr.common.cloud.ZooKeeperException:
 at
 org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:931)
 at
 org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:892)
 at
 org.apache.solr.core.CoreContainer.register(CoreContainer.java:841)
 at
 org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:638)
 at
 org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
 at
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at

 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at

 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
 Caused by: java.lang.NullPointerException
 at

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:190)
 at

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:156)
 at

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:100)
 at
 org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:266)
 at
 org.apache.solr.cloud.ZkController.joinElection(ZkController.java:935)
 at
 org.apache.solr.cloud.ZkController.register(ZkController.java:761)
 at
 org.apache.solr.cloud.ZkController.register(ZkController.java:727)
 at
 org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:908)
 ... 12 more

 and Solr fails to start. However, if I add updateLog in my solrconfig.xml,
 it starts. Is the updateLog parameter mandatory for Solr 4.2?



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/updateLog-in-Solr-4-2-tp4055548.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Regards,
Samarth


Re: Error on using the projection parameter - fl - in Solr 4

2013-01-26 Thread samarth s
Thanks Erick.

Will take your advice as a long-term solution. Currently working around it by
using the regex/glob capability added to the parsing of the 'fl' parameter,
i.e. 'fl=E_*'.


On Wed, Jan 9, 2013 at 6:07 AM, Erick Erickson erickerick...@gmail.com wrote:

 You really have a field name with '@' symbols in it? If it worked in 3.6,
 it was probably not intentional, classic undocumented behavior.

 The first thing I'd try is replacing the @ with __ in my schema...

 Best
 Erick

 On Tue, Jan 8, 2013 at 6:58 AM, samarth s samarth.s.seksa...@gmail.com
 wrote:

  q=*:*&fl=E_abc@@xyz




-- 
Regards,
Samarth


Error on using the projection parameter - fl - in Solr 4

2013-01-08 Thread samarth s
Hi all,

I am in a process of migrating my application from Solr 3.6 to Solr 4. A
query that used to work is giving an error with Solr 4.

The query looks like:
q=*:*&fl=E_abc@@xyz

The error displayed on the admin page is:
can not use FieldCache on multivalued field: E_abc

The field printed in the error has dropped the part after the character '@'.

Could not find any useful pointers on the forums, except one that has a
similar issue but while using the 'qt' parameter. Reference to this chain
is:
Subject: multivalued filed question (FieldCache error) on solr-user forums

Thanks for any pointers.

-- 
Regards,
Samarth


Re: Atomicity of commits (soft OR hard) across replicas - Solr Cloud

2013-01-07 Thread samarth s
Thanks, Tomás!! This was useful.


On Mon, Dec 31, 2012 at 6:03 PM, Tomás Fernández Löbbe 
tomasflo...@gmail.com wrote:

 If by cronned commit you mean auto-commit: auto-commits are local to
 each node and are not distributed, so there is no such thing as
 cluster-wide atomicity there. The commit may be performed on one node
 now, and on other nodes in 5 minutes (depending on the maxTime you have
 configured).
 If you mean that you are issuing commits from outside Solr, those are going
 to be distributed to all the nodes by default. The operation will succeed
 only if all nodes succeed; if one of the nodes fails, the operation will
 fail. However, the nodes that did succeed WILL have a new view of the index
 at this point. (I'm not sure if anything is done about the failing node in
 this situation.)

 The local commit operation in one node *is* atomic.

 Tomás


 On Mon, Dec 31, 2012 at 7:04 AM, samarth s samarth.s.seksa...@gmail.com
 wrote:

  Tried reading articles online, but could not find one that confirmed the
  same 100% :).
 
  Does a cronned soft commit complete its commit cycle only after all the
  replicas have the newest data visible?
 
  --
  Regards,
  Samarth
 




-- 
Regards,
Samarth


Solr cloud in 4.0 with NRT performance

2012-09-14 Thread samarth s
Hi,

I am currently using features like faceting and grouping/collapsing on Solr
3.6. Writing is user-driven, and updates are hence expected to be visible in
real time, or at least near real time. These updates should be consistent in
facet and group results as well. Also, to handle the query load, I may have to
use replication/sharding, with or without Solr Cloud.

I am planning to migrate to Solr 4.0 and use its powerful features of NRT
(soft commit) and Solr Cloud (using ZooKeeper) to achieve the above
requirements.

Is a Solr Cloud cluster with a replication level greater than 1 capable of
giving NRT results?
If yes, do these NRT results work with all kinds of querying, like faceting
and grouping?

It would be great if someone could share their insights and numbers on these
questions.
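
For context, the NRT visibility I have in mind would be driven by an
autoSoftCommit block in solrconfig.xml, something like this sketch (the
interval is hypothetical):

<autoSoftCommit>
  <maxTime>1000</maxTime> <!-- make new documents searchable within ~1s -->
</autoSoftCommit>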

-- 
Regards,
Samarth


Re: Solr 4.0 Beta Release

2012-09-13 Thread samarth s
Thanks Jack.

On Wed, Sep 12, 2012 at 8:08 PM, Jack Krupansky j...@basetechnology.com wrote:

 Yes, it has been released. Read the details here (including download
 instructions/links):
 http://lucene.apache.org/solr/solrnews.html

 -- Jack Krupansky

 -Original Message- From: samarth s
 Sent: Wednesday, September 12, 2012 9:54 AM
 To: solr-user@lucene.apache.org
 Subject: Solr 4.0 Beta Release


 Hi All,

 Would just like to verify if Solr 4.0 Beta has been released. Does the
 following url give the official beta release:
 http://www.apache.org/dyn/closer.cgi/lucene/solr/4.0.0-BETA

 --
 Regards,
 Samarth




-- 
Regards,
Samarth


Solr 4.0 Beta Release

2012-09-12 Thread samarth s
Hi All,

Would just like to verify if Solr 4.0 Beta has been released. Does the
following url give the official beta release:
http://www.apache.org/dyn/closer.cgi/lucene/solr/4.0.0-BETA

-- 
Regards,
Samarth


Re: Too many connections in CLOSE_WAIT state on master solr server

2012-03-18 Thread samarth s
Hi Ranveer,

You can try '-Dhttp.maxConnections' out; it may resolve the issue.
But I figured the root cause may lie with some queries made to Solr
that are too heavy to have decent turnaround times. As a result, the
client may close the connection abruptly, resulting in half-closed
connections. You can also try adding a search timeout to Solr queries:
https://issues.apache.org/jira/browse/SOLR-502
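
For concreteness, SOLR-502 added the timeAllowed parameter, so a capped query
looks like this (threshold hypothetical):

q=some+heavy+query&timeAllowed=5000

Solr returns whatever partial results it has once the 5 seconds of search
time are used up.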

On Tue, Jan 10, 2012 at 8:06 AM, Ranveer ranveer.s...@gmail.com wrote:
 Hi,

 I am facing the same problem. Did -Dhttp.maxConnections resolve the problem?

 Please let us know!

 regards
 Ranveer



 On Thursday 15 December 2011 11:30 AM, samarth s wrote:

 Thanks Erick and Mikhail. I'll try this out.

 On Wed, Dec 14, 2011 at 7:11 PM, Erick Erickson erickerick...@gmail.com
  wrote:

 I'm guessing (and it's just a guess) that what's happening is that
 the container is queueing up your requests while waiting
 for the other connections to close, so Mikhail's suggestion
 seems like a good idea.

 Best
 Erick

 On Wed, Dec 14, 2011 at 12:28 AM, samarth s
 samarth.s.seksa...@gmail.com  wrote:

 The updates to the master are user driven, and are needed to be
 visible quickly. Hence, the high frequency of replication. It may be
 that too many replication requests are being handled at a time, but
 why should that result in half closed connections?

 On Wed, Dec 14, 2011 at 2:47 AM, Erick Erickson erickerick...@gmail.com
  wrote:

 Replicating 40 cores every 20 seconds is just *asking* for trouble.
 How often do your cores change on the master? How big are
 they? Is there any chance you just have too many cores replicating
 at once?

 Best
 Erick

 On Tue, Dec 13, 2011 at 3:52 PM, Mikhail Khludnev
 mkhlud...@griddynamics.com  wrote:

  You can try to reuse your connections (prevent them from closing) by
  specifying -Dhttp.maxConnections=N (see
  http://download.oracle.com/javase/1.4.2/docs/guide/net/properties.html)
  in the JVM startup params. At the client JVM! N should be chosen
  considering the number of connections you'd like to keep alive.

 Let me know if it works for you.

  On Tue, Dec 13, 2011 at 2:57 PM, samarth s
  samarth.s.seksa...@gmail.com wrote:

 Hi,

 I am using solr replication and am experiencing a lot of connections
 in the state CLOSE_WAIT at the master solr server. These disappear
 after a while, but till then the master solr stops responding.

 There are about 130 open connections on the master server with the
 client as the slave m/c and all are in the state CLOSE_WAIT. Also,
 the
 client port specified on the master solr server netstat results is
 not
 visible in the netstat results on the client (slave solr) m/c.

 Following is my environment:
 - 40 cores in the master solr on m/c 1
 - 40 cores in the slave solr on m/c 2
 - The replication poll interval is 20 seconds.
 - Replication part in solrconfig.xml in the slave solr:
 <requestHandler name="/replication" class="solr.ReplicationHandler">
   <lst name="slave">

     <!-- fully qualified url for the replication handler of master -->
     <str name="masterUrl">$mastercorename/replication</str>

     <!-- Interval in which the slave should poll master. Format is HH:mm:ss.
          If this is absent the slave does not poll automatically.
          But a fetchindex can be triggered from the admin or the http API -->
     <str name="pollInterval">00:00:20</str>

     <!-- The following values are used when the slave connects to the master
          to download the index files. Default values implicitly set as 5000ms
          and 10000ms respectively. The user DOES NOT need to specify these
          unless the bandwidth is extremely low or if there is an extremely
          high latency -->
     <str name="httpConnTimeout">5000</str>
     <str name="httpReadTimeout">10000</str>
   </lst>
 </requestHandler>

 Thanks for any pointers.

 --
 Regards,
 Samarth



 --
 Sincerely yours
 Mikhail Khludnev
 Developer
 Grid Dynamics
 tel. 1-415-738-8644
 Skype: mkhludnev
 http://www.griddynamics.com
  mkhlud...@griddynamics.com



 --
 Regards,
 Samarth







-- 
Regards,
Samarth


Request Timeout Parameter in update queries

2012-03-16 Thread samarth s
Hi,

Does an update query to Solr work well when sent with a timeout
parameter? https://issues.apache.org/jira/browse/SOLR-502
For example, suppose an update query is fired with a timeout of 30
seconds, and the request gets aborted halfway due to the timeout. Can
this corrupt the index in any way?
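
For concreteness, one common place such a timeout is imposed is on the SolrJ
client socket rather than via a query parameter; a sketch with SolrJ 3.x
(URL and values hypothetical):

CommonsHttpSolrServer server =
    new CommonsHttpSolrServer("http://localhost:8983/solr/core");
server.setConnectionTimeout(5000); // ms allowed to establish the connection
server.setSoTimeout(30000);        // socket read timeout in ms; the request
                                   // is aborted client-side after 30 seconds

An abort here closes the HTTP connection mid-request; it does not cleanly
cancel the update on the server, which is the scenario I am asking about.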

-- 
Regards,
Samarth


Re: Too many connections in CLOSE_WAIT state on master solr server

2011-12-14 Thread samarth s
Thanks Erick and Mikhail. I'll try this out.

On Wed, Dec 14, 2011 at 7:11 PM, Erick Erickson erickerick...@gmail.com wrote:
 I'm guessing (and it's just a guess) that what's happening is that
 the container is queueing up your requests while waiting
 for the other connections to close, so Mikhail's suggestion
 seems like a good idea.

 Best
 Erick

 On Wed, Dec 14, 2011 at 12:28 AM, samarth s
 samarth.s.seksa...@gmail.com wrote:
 The updates to the master are user driven, and are needed to be
 visible quickly. Hence, the high frequency of replication. It may be
 that too many replication requests are being handled at a time, but
 why should that result in half closed connections?

 On Wed, Dec 14, 2011 at 2:47 AM, Erick Erickson erickerick...@gmail.com 
 wrote:
 Replicating 40 cores every 20 seconds is just *asking* for trouble.
 How often do your cores change on the master? How big are
 they? Is there any chance you just have too many cores replicating
 at once?

 Best
 Erick

 On Tue, Dec 13, 2011 at 3:52 PM, Mikhail Khludnev
 mkhlud...@griddynamics.com wrote:
  You can try to reuse your connections (prevent them from closing) by
  specifying -Dhttp.maxConnections=N (see
  http://download.oracle.com/javase/1.4.2/docs/guide/net/properties.html)
  in the JVM startup params. At the client JVM! N should be chosen
  considering the number of connections you'd like to keep alive.

 Let me know if it works for you.

  On Tue, Dec 13, 2011 at 2:57 PM, samarth s
  samarth.s.seksa...@gmail.com wrote:

 Hi,

 I am using solr replication and am experiencing a lot of connections
 in the state CLOSE_WAIT at the master solr server. These disappear
 after a while, but till then the master solr stops responding.

 There are about 130 open connections on the master server with the
 client as the slave m/c and all are in the state CLOSE_WAIT. Also, the
 client port specified on the master solr server netstat results is not
 visible in the netstat results on the client (slave solr) m/c.

 Following is my environment:
 - 40 cores in the master solr on m/c 1
 - 40 cores in the slave solr on m/c 2
 - The replication poll interval is 20 seconds.
 - Replication part in solrconfig.xml in the slave solr:
 <requestHandler name="/replication" class="solr.ReplicationHandler">
   <lst name="slave">

     <!-- fully qualified url for the replication handler of master -->
     <str name="masterUrl">$mastercorename/replication</str>

     <!-- Interval in which the slave should poll master. Format is HH:mm:ss.
          If this is absent the slave does not poll automatically.
          But a fetchindex can be triggered from the admin or the http API -->
     <str name="pollInterval">00:00:20</str>

     <!-- The following values are used when the slave connects to the master
          to download the index files. Default values implicitly set as 5000ms
          and 10000ms respectively. The user DOES NOT need to specify these
          unless the bandwidth is extremely low or if there is an extremely
          high latency -->
     <str name="httpConnTimeout">5000</str>
     <str name="httpReadTimeout">10000</str>
   </lst>
 </requestHandler>

 Thanks for any pointers.

 --
 Regards,
 Samarth




 --
 Sincerely yours
 Mikhail Khludnev
 Developer
 Grid Dynamics
 tel. 1-415-738-8644
 Skype: mkhludnev
 http://www.griddynamics.com
  mkhlud...@griddynamics.com



 --
 Regards,
 Samarth



-- 
Regards,
Samarth


Too many connections in CLOSE_WAIT state on master solr server

2011-12-13 Thread samarth s
Hi,

I am using Solr replication and am seeing a lot of connections in the
CLOSE_WAIT state on the master Solr server. These disappear after a while,
but until then the master Solr stops responding.

There are about 130 open connections on the master server whose client is the
slave m/c, all in the CLOSE_WAIT state. Also, the client port shown in the
netstat results on the master Solr server is not visible in the netstat
results on the client (slave Solr) m/c.

Following is my environment:
- 40 cores in the master solr on m/c 1
- 40 cores in the slave solr on m/c 2
- The replication poll interval is 20 seconds.
- Replication part in solrconfig.xml in the slave solr:
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">

    <!-- fully qualified url for the replication handler of master -->
    <str name="masterUrl">$mastercorename/replication</str>

    <!-- Interval in which the slave should poll master. Format is HH:mm:ss.
         If this is absent the slave does not poll automatically.
         But a fetchindex can be triggered from the admin or the http API -->
    <str name="pollInterval">00:00:20</str>

    <!-- The following values are used when the slave connects to the master
         to download the index files. Default values implicitly set as 5000ms
         and 10000ms respectively. The user DOES NOT need to specify these
         unless the bandwidth is extremely low or if there is an extremely
         high latency -->
    <str name="httpConnTimeout">5000</str>
    <str name="httpReadTimeout">10000</str>
  </lst>
</requestHandler>

Thanks for any pointers.

--
Regards,
Samarth


Re: Too many connections in CLOSE_WAIT state on master solr server

2011-12-13 Thread samarth s
The updates to the master are user-driven and need to be visible quickly;
hence the high frequency of replication. It may be that too many replication
requests are being handled at a time, but why should that result in
half-closed connections?

On Wed, Dec 14, 2011 at 2:47 AM, Erick Erickson erickerick...@gmail.com wrote:
 Replicating 40 cores every 20 seconds is just *asking* for trouble.
 How often do your cores change on the master? How big are
 they? Is there any chance you just have too many cores replicating
 at once?

 Best
 Erick

 On Tue, Dec 13, 2011 at 3:52 PM, Mikhail Khludnev
 mkhlud...@griddynamics.com wrote:
  You can try to reuse your connections (prevent them from closing) by
  specifying -Dhttp.maxConnections=N (see
  http://download.oracle.com/javase/1.4.2/docs/guide/net/properties.html)
  in the JVM startup params. At the client JVM! N should be chosen
  considering the number of connections you'd like to keep alive.

 Let me know if it works for you.

  On Tue, Dec 13, 2011 at 2:57 PM, samarth s
  samarth.s.seksa...@gmail.com wrote:

 Hi,

 I am using solr replication and am experiencing a lot of connections
 in the state CLOSE_WAIT at the master solr server. These disappear
 after a while, but till then the master solr stops responding.

 There are about 130 open connections on the master server with the
 client as the slave m/c and all are in the state CLOSE_WAIT. Also, the
 client port specified on the master solr server netstat results is not
 visible in the netstat results on the client (slave solr) m/c.

 Following is my environment:
 - 40 cores in the master solr on m/c 1
 - 40 cores in the slave solr on m/c 2
 - The replication poll interval is 20 seconds.
 - Replication part in solrconfig.xml in the slave solr:
 <requestHandler name="/replication" class="solr.ReplicationHandler">
   <lst name="slave">

     <!-- fully qualified url for the replication handler of master -->
     <str name="masterUrl">$mastercorename/replication</str>

     <!-- Interval in which the slave should poll master. Format is HH:mm:ss.
          If this is absent the slave does not poll automatically.
          But a fetchindex can be triggered from the admin or the http API -->
     <str name="pollInterval">00:00:20</str>

     <!-- The following values are used when the slave connects to the master
          to download the index files. Default values implicitly set as 5000ms
          and 10000ms respectively. The user DOES NOT need to specify these
          unless the bandwidth is extremely low or if there is an extremely
          high latency -->
     <str name="httpConnTimeout">5000</str>
     <str name="httpReadTimeout">10000</str>
   </lst>
 </requestHandler>

 Thanks for any pointers.

 --
 Regards,
 Samarth




 --
 Sincerely yours
 Mikhail Khludnev
 Developer
 Grid Dynamics
 tel. 1-415-738-8644
 Skype: mkhludnev
 http://www.griddynamics.com
  mkhlud...@griddynamics.com



-- 
Regards,
Samarth


Re: Solr Open File Descriptors

2011-10-22 Thread samarth s
Thanks for sharing your insights, Shawn.

On Mon, Oct 17, 2011 at 1:27 AM, Shawn Heisey s...@elyograg.org wrote:

 On 10/16/2011 12:01 PM, samarth s wrote:

 Hi,

  Is it safe to assume that with a mergeFactor of 10 the open file
  descriptors required by Solr would be around (1 + 10) * 10 = 110?
  ref: http://onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed
  The Solr wiki
  (http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations)
  states that the FDs required per segment are around 7.

  Are these estimates appropriate? Do they in any way depend on the size of
  the index and the number of docs (assuming the same number of segments in
  either case) as well?


  My index has 10 files per normal segment (the usual 7 plus three more for
 termvectors).  Some of the segments also have a .del file, and there is a
 segments_* file and a segments.gen file.  Your servlet container and other
 parts of the OS will also have to open files.

 I have personally seen three levels of segment merging taking place at the
 same time on a slow filesystem during a full-import, along with new content
 coming in at the same time.  With a mergefactor of 10, each merge is 11
 segments - the ten that are being merged and the merged segment.  If you
 have three going on at the same time, that's 33 segments, and you can have
 up to 10 more that are actively being built by ongoing index activity, so
 that's 43 potential segments.  If your filesystem is REALLY slow, you might
 end up with even more segments as existing merges are paused for new ones to
  start, but if you run into that, you'll want to update your hardware, so I
 won't consider it.

 Multiplying 43 segments by 11 files per segment yields a working
 theoretical maximum of 473 files.  Add in the segments files, you're up to
 475.

 Most operating systems have a default FD limit that's at least 1024.  If
 you only have one index (core) on your Solr server, Solr is the only thing
 running on that server, and it's using the default mergeFactor of 10, you
 should be fine with the default.  If you are going to have more than one
 index on your Solr server (such as a build core and a live core), you plan
 to run other things on the server, or you want to increase your mergeFactor
 significantly, you might need to adjust the OS configuration to allow more
 file descriptors.

 Thanks,
 Shawn




-- 
Regards,
Samarth


Solr Open File Descriptors

2011-10-16 Thread samarth s
Hi,

Is it safe to assume that with a mergeFactor of 10 the open file descriptors
required by Solr would be around (1 + 10) * 10 = 110?
ref: http://onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed
The Solr wiki
(http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations)
states that the FDs required per segment are around 7.

Are these estimates appropriate? Do they in any way depend on the size of the
index and the number of docs (assuming the same number of segments in either
case) as well?


-- 
Regards,
Samarth


Field Cache

2011-05-07 Thread samarth s
Hi,

I have read that the Lucene field cache is used in faceting and sorting. Is
it also populated/used when only selected fields are retrieved, via the 'fl'
or the 'included fields in collapse' parameters? Is it also used for
collapsing?

-- 
Regards,
Samarth


Re: exception obtaining write lock on startup

2011-01-17 Thread samarth s
In that case why is there a separate lock factory of SingleInstanceLockFactory?

On Fri, Dec 31, 2010 at 6:25 AM, Lance Norskog goks...@gmail.com wrote:
 This will not work. At all.

 You can only have one Solr core instance changing an index.

 On Thu, Dec 30, 2010 at 4:38 PM, Tri Nguyen tringuye...@yahoo.com wrote:
 Hi,

 I'm getting this exception when I have 2 cores as masters. It seems like one
 of the cores obtains a lock (file) and then the other tries to obtain the
 same one. However, the first one is not deleted.

 How do I fix this?

 Dec 30, 2010 4:34:48 PM org.apache.solr.handler.ReplicationHandler inform
 WARNING: Unable to get IndexCommit on startup
 org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
 NativeFSLock@..\webapps\solr\tnsolr\data\index\lucene-fe3fc928a4bbfeb55082e49b32a70c10-write.lock
     at org.apache.lucene.store.Lock.obtain(Lock.java:85)
     at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1565)
     at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1421)
     at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:191)
     at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:98)
     at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173)
     at org.apache.solr.update.DirectUpdateHandler2.forceOpenWriter(DirectUpdateHandler2.java:376)
     at org.apache.solr.handler.ReplicationHandler.inform(ReplicationHandler.


 Tri



 --
 Lance Norskog
 goks...@gmail.com



Collapsing with start, rows parameters

2010-12-29 Thread samarth s
Hi,

I am using collapsing with the start & rows parameters.
For start=0 & rows=10 my query looks like:
q=f1:v1+AND+f2:v2&date:[*+TO+*]&rows=10&start=0&fl=rootId&collapse.field=rootId&collapse.threshold=1&collapse.type=normal&collapse.includeCollapsedDocs.fl=id

The same query with start=10 gives me an overlapping result, i.e. the last
two of the first query's collapse groups appear in the second query's results
as the first two groups. As the value of the start parameter increases, the
number of overlapping groups changes somewhat arbitrarily; sometimes it is 3,
5, etc.

I am working with the patch for SOLR-236 dated 2010-06-17 03:08 PM. Is this a
known issue that has been fixed?

Thanks for any pointers,
Samarth


Total number of groups after collapsing

2010-12-23 Thread samarth s
Hi,

I have been using collapsing in my application. I have a requirement of
finding the number of groups matching some filter criteria, something like a
COUNT(DISTINCT columnName). The only solution I can currently think of is the
query:

q=*:*&rows=Integer.MAX_VALUE&start=0&fl=score&collapse.field=abc&collapse.threshold=1&collapse.type=normal

I get the number of groups from 'numFound', but this seems like a bad
solution in terms of performance. Is there a cleaner way?

Thanks,
Samarth


Re: Total number of groups after collapsing

2010-12-23 Thread samarth s
Hi,

I figured out a better way of doing it. The following query would be a
better option:
q=*:*&start=2147483647&rows=0&collapse=true&collapse.field=abc&collapse.threshold=1

Thanks,
Samarth

On Thu, Dec 23, 2010 at 8:57 PM, samarth s samarth.s.seksa...@gmail.com wrote:

 Hi,

 I have been using collapsing in my application. I have a requirement of
 finding the no of groups matching some filter criteria.
 Something like a COUNT(DISTINCT columnName). The only solution I can
 currently think of is using the query:


  q=*:*&rows=Integer.MAX_VALUE&start=0&fl=score&collapse.field=abc&collapse.threshold=1&collapse.type=normal

 I get the number of groups from 'numFound', but this seems like a bad
 solution in terms of performance. Is there a cleaner way?

 Thanks,
 Samarth



Glob in fl parameter

2010-12-22 Thread samarth s
Hi,

Is there any support for globs in the 'fl' param? This would be very useful
for retrieving dynamic fields. I have read the wiki page
FieldAliasesAndGlobsInParams. Is there any related patch?
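
For concreteness, what I am after is something like this (the dynamic-field
prefix is hypothetical):

fl=id,attr_*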

Thanks for any pointers,
Samarth


Re: solr dynamic core creation

2010-11-21 Thread samarth s
Hi nizan,

I have the same requirement of creating cores on the fly. I was looking for
an API exposed over HTTP by the Solr server. Currently I am working around it
by running my own shell script on the server (the Solr server :) ). Any
better leads on the same?

Thanks,
Samarth

On Thu, Nov 11, 2010 at 9:27 PM, Robert Sandiford
bob.sandif...@sirsidynix.com wrote:

 No - in reading what you just wrote, and what you originally wrote, I think
 the misunderstanding was mine, based on the architecture of my code.  In my
 code, it is our 'server' level that does the SolrJ indexing calls, but you
 meant 'server' to be the Solr instance, and what you mean by 'client' is
 what I was thinking of (without thinking) as the 'server'...

 Sorry about that.  Hopefully someone else can chime in on your specific
 issue...
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/solr-dynamic-core-creation-tp1867705p1883354.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Dynamically create new core

2010-11-02 Thread samarth s
Hi,


I have a requirement of dynamically creating new (master) cores, each of
which should have a replicated slave core. I am working in Java and using
SolrJ as my Solr client. I came across the CoreAdminRequest class, and it
looks like the way to go.

CoreAdminRequest.createCore("NewCore1", "NewCore1", solrServer);
creates a new core programmatically.

Also, for the newly created core, I want to use an existing solrconfig.xml
and modify certain parameters. Can I achieve this using SolrJ?

Are there any better approaches for the requirement?
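
For reference, a minimal SolrJ sketch of the createCore call above (SolrJ
3.x; the URL and core names are hypothetical, and the instanceDir must
already contain a conf/ directory on the server):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;

public class CreateCore {
  public static void main(String[] args) throws Exception {
    // Point at the Solr root, not a particular core, so that
    // CoreAdminRequest can reach the /admin/cores handler.
    SolrServer solrServer = new CommonsHttpSolrServer("http://localhost:8983/solr");
    CoreAdminRequest.createCore("NewCore1", "NewCore1", solrServer);
  }
}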

Thanks for any pointers,