I need to store in Solr all the data of my clients' mailing activity.
The data contains metadata like From, To, Date, Time, Subject, etc.
I would easily have 1,000 million records every 2 months.
What I am currently doing is creating cores per client. So I have 400 cores
already.
Is this a good idea to
It happens once the server is fully started. And when it gets stuck,
sometimes I have to restart the server; sometimes I'm able to edit the
solrconfig.xml and reload it.
Harun Reşit Zafer
TÜBİTAK BİLGEM BTE
Bulut Bilişim ve Büyük Veri Analiz Sistemleri Bölümü
T +90 262 675 3268
W
Hi Ramprasad,
You can certainly have a system with hundreds of cores. I know of more than
a few people who have done that successfully in their setups.
At the same time, I'd also recommend that you have a look at SolrCloud.
SolrCloud takes away the operational pains like replication/recovery
On Tue, 2014-08-12 at 08:40 +0200, Ramprasad Padmanabhan wrote:
I need to store in Solr all the data of my clients' mailing activity.
The data contains metadata like From, To, Date, Time, Subject, etc.
I would easily have 1,000 million records every 2 months.
If standard searches are always inside a
I think this question is more aimed at the design and performance of a large
number of cores.
Also, Solr is designed to handle multiple cores effectively; however, it
would be interesting to know if you have observed any performance problems
with a growing number of cores, with number of nodes and Solr
Hi Alex,
Thanks for your reply.
I'm comparing Solr vs. ElasticSearch.
Does Solr store the index on HDFS in raw Lucene format? I mean, if it does,
we can get the index files from HDFS and directly put them into an
application based on Lucene.
It seems that ElasticSearch does not store the raw
I tried again to make sure. The server starts and I can see the web admin
GUI, but I can't navigate between tabs. It just says loading. But on the terminal
console everything seems normal.
On Tue, 2014-08-12 at 01:27 +0200, dancoleman wrote:
My SolrCloud of 3 shards / 3 replicas is having a lot of OOM errors. Here are
some specs on my setup:
hosts: all are EC2 m1.large with 250G data volumes
Is that 3 (each running a primary and a replica shard) or 6 instances?
documents:
Are there documented benchmarks with large numbers of cores?
As of now I just have a test bed.
We have 150 million records (will go up to 1,000M), distributed across 400
cores.
A single machine with 16 GB RAM + 16 CPU cores is handling search fine,
but I am still not sure whether this will work fine in production.
Hi,
is http://wiki.apache.org/solr/Support page immutable?
Dmitry
On Fri, Aug 8, 2014 at 4:24 PM, Jack Krupansky j...@basetechnology.com
wrote:
And the Solr Support list is where people register their available
consulting services:
http://wiki.apache.org/solr/Support
-- Jack Krupansky
Hi,
I have a TrieDateField where I want to store a date in yyyy-MM-dd format,
as my source contains the date in the same format.
As I understand it, TrieDateField stores dates in yyyy-MM-dd'T'HH:mm:ss format,
hence the date is getting formatted to that.
Kindly let me know:
How can I change the
Use the parse date update request processor:
http://lucene.apache.org/solr/4_9_0/solr-core/org/apache/solr/update/processor/ParseDateFieldUpdateProcessorFactory.html
Additional examples are in my e-book:
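For reference, a minimal sketch of what such a chain might look like in solrconfig.xml (the chain name and the single entry in the format list are illustrative; see the javadoc above for the full option set):

```xml
<!-- Hypothetical chain name; list the formats your source data actually uses -->
<updateRequestProcessorChain name="parse-date">
  <processor class="solr.ParseDateFieldUpdateProcessorFactory">
    <arr name="format">
      <str>yyyy-MM-dd</str>
    </arr>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The chain only takes effect on update requests that actually reference it (e.g. via an `update.chain` parameter or a handler default).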
On Tue, 2014-08-12 at 11:50 +0200, Ramprasad Padmanabhan wrote:
Are there documented benchmarks with large numbers of cores?
As of now I just have a test bed.
We have 150 million records (will go up to 1,000M), distributed across 400
cores.
A single machine with 16 GB RAM + 16 CPU cores: search is working
Sorry for the missing information. My Solr cores take less than 200 MB of disk
each.
What I am worried about is that if I run too many cores on a single Solr
machine, there will be a limit to the number of concurrent searches it can
support. I am still benchmarking this.
Also, another major bottleneck I
On Tue, 2014-08-12 at 14:14 +0200, Ramprasad Padmanabhan wrote:
Sorry for missing information. My solr-cores take less than 200MB of
disk
So ~3 GB/server. If you do not have special heavy queries, a high query
rate or heavy requirements for index availability, that really sounds
like you could
Hi Jack,
Thanks for your suggestion. I think the way I am using
ParseDateFieldUpdateProcessorFactory is not right, hence the date is not
getting transformed to the desired format.
I added the following in solrconfig.xml and see no effect in the search results. The
date is still in yyyy-MM-dd'T'HH:mm:ss.
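One common reason a chain seems to have "no effect" is that it is defined but never referenced by the /update handler. A hedged sketch of the wiring (the chain name `parse-date` is an assumption about how yours is named):

```xml
<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">parse-date</str>
  </lst>
</requestHandler>
```

Note also that this processor only changes how incoming values are parsed at index time; query responses will still render date fields in the full ISO form.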
Hi Ramprasad,
I have used it in a cluster with millions of users (1 user per core) in
legacy cloud mode. We used the on-demand core loading feature, where each
Solr had 30,000 cores and only 2,000 cores were in memory at a time. You are
just hitting 400, and I don't see much of a problem. What is
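For anyone wanting to try the same approach, the transient-core knobs look roughly like this in 4.x (the cache size is illustrative):

```xml
<!-- solr.xml: cap how many transient cores stay loaded at once -->
<solr>
  <int name="transientCacheSize">2000</int>
</solr>
```

Each core that should be lazily loaded and evictable then sets `transient=true` and `loadOnStartup=false` in its core.properties.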
Hi Paul and Ramprasad,
I follow your discussion with interest as I will have more or less the
same requirement.
When you say that you use on-demand core loading, are you talking about
the LotsOfCores stuff?
Erick told me that it does not work very well in a distributed
environment.
How do you
Harun,
What do you mean by the terminal console? Do you mean to say the admin gui
freezes but you can still issue queries to solr directly through your browser?
James Dyer
Ingram Content Group
(615) 213-4311
-Original Message-
From: Harun Reşit Zafer
On 12 August 2014 18:18, Noble Paul noble.p...@gmail.com wrote:
Hi Ramprasad,
I have used it in a cluster with millions of users (1 user per core) in
legacy cloud mode. We used the on-demand core loading feature, where each
Solr had 30,000 cores and only 2,000 cores were in memory at a time.
I've been trying to debug through this but I'm stumped. I have a Solr index
with ~40 million documents indexed currently sitting idle. I update an
existing document through the web interface (collection1 - Documents -
/update) and the web request returns successfully. At this point, I expect
On 8/12/2014 3:57 AM, Dmitry Kan wrote:
Hi,
is http://wiki.apache.org/solr/Support page immutable?
All pages on that wiki are changeable by end users. You just need to
create an account on the wiki and then ask on this list to have your
wiki username added to the Contributor group.
Thanks,
Ramprasad Padmanabhan [ramprasad...@gmail.com] wrote:
I have a single machine with 16 GB RAM and 16 CPU cores
Ah! I thought you had more machines, each with 16 Solr cores.
This changes a lot. 400 Solr cores of ~200 MB each ≈ 80 GB of data. You're aiming for
7 times that, so about 500 GB of data. Running
The response will always be the full specification,
so you'll have the yyyy-MM-dd'T'HH:mm:ss format.
If you want the user to just see the yyyy-MM-dd
part, you could use a DocTransformer to change it on
the way out.
You cannot change the way the dates are stored
internally. The DocTransformer is just there
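If writing a custom DocTransformer is more than you need, trimming the timestamp on the client side is trivial. A minimal Python sketch (the function name is illustrative):

```python
from datetime import datetime

def to_date_only(solr_timestamp: str) -> str:
    """Convert Solr's ISO-8601 timestamp (e.g. '2014-08-12T00:00:00Z')
    to a plain yyyy-MM-dd date string."""
    parsed = datetime.strptime(solr_timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.strftime("%Y-%m-%d")

print(to_date_only("2014-08-12T00:00:00Z"))  # → 2014-08-12
```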
You haven't given us a lot of information to go on (i.e., full
solrconfig.xml, log messages around the time of your update, etc.), but
my best guess would be that you are seeing a delay between the time the
new searcher is opened and the time the newSearcher is made available to
requests due
I'm not seeing any messages in the log with respect to cache warming at the
time, but I will investigate that possibility. Thank you. In case it is
helpful, I pasted the entire solrconfig.xml at http://pastebin.com/C0iQ7E9a
: I'm not seeing any messages in the log with respect to cache warming at the
: time, but I will investigate that possibility. Thank you. In case it is
what logs *do* you see at the time you send the doc?
w/o details, we can't help you.
: helpful, I pasted the entire solrconfig.xml at
The machines were 32 GB RAM boxes. You must do the RAM requirement
calculation for your indexes. The number of indexes alone won't be enough
to arrive at the RAM requirement.
On Tue, Aug 12, 2014 at 6:59 PM, Ramprasad Padmanabhan
ramprasad...@gmail.com wrote:
On 12 August 2014 18:18, Noble
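As a rough back-of-the-envelope sketch of that calculation, using only the numbers already quoted in this thread (400 cores at ~200 MB each, growing ~7x; purely illustrative):

```python
def total_index_size_gb(num_cores: int, mb_per_core: float) -> float:
    """Total on-disk index size across all cores, in GB."""
    return num_cores * mb_per_core / 1024.0

current_gb = total_index_size_gb(400, 200)   # ~78 GB today
projected_gb = current_gb * 7                # ~550 GB at 1000M records

# For good latency you generally want enough free RAM (beyond the JVM
# heap) for the OS page cache to hold the frequently-hit portion of the
# index; the raw index size alone doesn't tell you how big that is.
print(round(current_gb, 1), round(projected_gb, 1))  # → 78.1 546.9
```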
Immediately after triggering the update, this is what is in the logs:
2014-08-12 12:58:48,774 | [71] | 153414367 [qtp2038499066-4772] INFO
org.apache.solr.update.processor.LogUpdateProcessor – [collection1]
webapp=/solr path=/update params={wt=json} {add=[52627624
(1476251068652322816)]} 0 34
I just pinged someone who really knows this stuff and the reply is that
he's copied the index from HDFS to a local file system in order to
inspect it with Luke, which means the bits on disk are identical and
may freely be copied back and forth. So I'd say go for it.
Erick
On Tue, Aug 12, 2014
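For context, storing the index on HDFS in Solr is a matter of swapping the directory factory in solrconfig.xml; the files written are ordinary Lucene segment files, which is why copying them back and forth works. A minimal sketch (the namenode URL is a placeholder):

```xml
<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs://namenode:8020/solr</str>
</directoryFactory>
```

You would also set `<lockType>hdfs</lockType>` inside the `<indexConfig>` section.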
Based on your solrconfig.xml settings for the filter and queryResult caches, I
believe Chris's initial guess is correct. After a commit, there is likely
plenty of time spent warming these caches due to the significantly high
autowarm counts.
<filterCache class="solr.FastLRUCache"
I have modified my instances to m2.4xlarge 64-bit with 68.4 GB memory. Hate to
ask this, but can you recommend Java memory and GC settings for 90 GB of data
and the above memory? Currently I have
CATALINA_OPTS="${CATALINA_OPTS} -XX:NewSize=1536m -XX:MaxNewSize=1536m
-Xms5120m -Xmx5120m -XX:+UseParNewGC"
Hello,
Please provide me access.
User id vzhovtyuk
My email vzhovt...@gmail.com
Wiki user 'Vitaliy Zhovtyuk'
The field value is this:
20世紀の100人;ポートレートアーカイブス;政治家・軍人;政治家・指導者・軍人;[政治],100peopeof20century,pploftwentycentury,pploftwentycentury
The problem: We can't match this field with a search for
100peopeof20century. The analysis shows that there are three terms
indexed at the critical point by
On 8/12/2014 3:12 PM, tuxedomoon wrote:
I have modified my instances to m2.4xlarge 64-bit with 68.4G memory. Hate to
ask this but can you recommend Java memory and GC settings for 90G data and
the above memory? Currently I have
CATALINA_OPTS=${CATALINA_OPTS} -XX:NewSize=1536m
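As a hedged starting point only (every workload differs, so verify against your own GC logs): with 68 GB of RAM and ~90 GB of index, the usual advice is a modest heap so the OS page cache gets the rest, plus CMS collector tuning, e.g.:

```sh
# Illustrative values, not a recommendation for your exact workload
CATALINA_OPTS="${CATALINA_OPTS} \
  -Xms8g -Xmx8g \
  -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
  -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly \
  -verbose:gc -Xloggc:/var/log/solr-gc.log"
```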
On 8/12/2014 3:29 PM, Vitaliy Zhovtiuk wrote:
Please provide me access.
User id vzhovtyuk
My email vzhovt...@gmail.com
Wiki user 'Vitaliy Zhovtyuk'
Wiki username added to the Solr wiki Contributor group. You didn't
indicate exactly what kind of access you wanted, but that's the only
kind of
See the original message on this thread for full details. Some
additional information:
This happens on version 4.6.1, 4.7.2, and 4.9.0. Here is a screenshot
showing the analysis problem in more detail. The first line you can see
is the ICUTokenizer.
I'm very new to Solr, and could use a point in the right direction on a task
I've been assigned. I have a database containing customer information
(phone number, email address, credit card, billing address, shipping
address, etc.).
I need to be able to take user-entered data, and use it to
- Reply message -
From: Shawn Heisey s...@elyograg.org
To: solr-user@lucene.apache.org solr-user@lucene.apache.org
Subject: ICUTokenizer acting very strangely with oriental characters
Date: Tue, Aug 12, 2014 19:00
See the original message on
Shawn,
ICUTokenizer is operating as designed here.
The key to understanding this is
o.a.l.analysis.icu.segmentation.ScriptIterator.isSameScript(), called from
ScriptIterator.next() with the scripts of two consecutive characters; these
methods together find script boundaries. Here’s
On 8/12/2014 6:29 PM, Steve Rowe wrote:
Shawn,
ICUTokenizer is operating as designed here.
The key to understanding this is
o.a.l.analysis.icu.segmentation.ScriptIterator.isSameScript(), called from
ScriptIterator.next() with the scripts of two consecutive characters; these
methods
In the table below, the IsSameS (is same script) and SBreak? (script
break = not IsSameS) decisions are based on what I mentioned in my previous
message, and the WBreak (word break) decision is based on UAX#29 word
break rules:
Char | Code Point | Script | IsSameS? | SBreak? | WBreak?
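Steve's point about script boundaries can be illustrated with a toy version (this is only a sketch of the idea: it guesses a script per character and splits wherever consecutive characters disagree; real ICU uses the Unicode Script property plus the UAX#29 word-break rules, which behave differently for digits and punctuation):

```python
import unicodedata

def rough_script(ch: str) -> str:
    """Very rough script guess from the Unicode character name."""
    name = unicodedata.name(ch, "UNKNOWN")
    for script in ("CJK", "HIRAGANA", "KATAKANA", "LATIN"):
        if name.startswith(script):
            return script
    return "OTHER"

def split_on_script_change(text: str):
    """Split text wherever two consecutive characters differ in script."""
    tokens, current = [], ""
    for ch in text:
        if current and rough_script(ch) != rough_script(current[-1]):
            tokens.append(current)
            current = ""
        current += ch
    if current:
        tokens.append(current)
    return tokens

print(split_on_script_change("政治家abc"))  # → ['政治家', 'abc']
```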
Thanks Erick.
I will try.
And how many machines are running Solr?
On 12 August 2014 22:12, Noble Paul noble.p...@gmail.com wrote:
The machines were 32GB ram boxes. You must do the RAM requirement
And how many machines are running Solr?
I expect that I will have to add more servers. What I am looking for is how