If you consider what n-grams do, this should make sense to you. Consider the
following piece of data:
White iPod
If the field is fed through a bigram filter (n-gram with size of 2) the
resulting token stream would appear as such:
wh hi it te
ip po od
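The sliding window itself is simple; here is a quick Python sketch of the idea (a plain character window, not Solr's actual NGramTokenFilter implementation):

```python
def ngrams(token, n=2):
    """Return the contiguous n-character substrings of a token."""
    return [token[i:i + n] for i in range(len(token) - n + 1)]

# Each whitespace-separated (and lowercased) token is filtered independently:
for token in "White iPod".lower().split():
    print(" ".join(ngrams(token)))
```

This prints `wh hi it te` and then `ip po od`, matching the token stream above.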
The usual use of n-grams is to match
The limitations on how many threads you can use to load data are primarily
driven by factors of your hardware: CPU, heap usage, I/O, and the like. It is
common for most index load processes to be able to handle more incoming data on
the Solr side of the equation than can typically be loaded
As an endorsement of Erick's point, the primary benefit I see to processing
through your own code is better error-, exception-, and logging-handling which
is trivial for you to write.
Consider that your code could reside on any server, either receiving through a
PUSH or PULLing the data from
The best use case I see for atomic updates typically involves avoiding the
transmission of large documents for small field updates. If you are updating a
readCount field of a PDF document that is 1MB in size, you will avoid
resending the 1MB PDF document's data in order to increment the readCount.
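With the atomic update JSON syntax, only the key and the delta travel over the wire (the id value here is hypothetical; all other fields must be stored for atomic updates to work):

```json
[ { "id": "doc123", "readCount": { "inc": 1 } } ]
```

POSTed to /update with Content-Type: application/json, this increments readCount without resending the document body.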
Very specifically, what is the field definition that is being used for the
suggestions?
On Oct 10, 2013, at 5:49 AM, Furkan KAMACI furkankam...@gmail.com wrote:
What is your configuration for auto suggestion?
2013/10/10 ar...@skillnetinc.com ar...@skillnetinc.com
Hi,
We are
The shards.qt parameter is the easiest one to forget, with the most dramatic of
consequences!
On Oct 8, 2013, at 11:10 AM, shamik sham...@gmail.com wrote:
James,
Thanks for your reply. The shards.qt did the trick. I read the
documentation earlier but was not clear on the implementation,
I don't know if there's a way to accomplish your goal directly, but as a pure
workaround, you can write a routine to fetch all the stored values and resubmit
the document without the field in question. This is what atomic updates do,
minus the overhead of the transmission.
On Oct 7, 2013, at
fq=here:there OR this:that
For the lurker: an AND should be:
fq=here:there&fq=this:that
While you can, technically, pass:
fq=here:there AND this:that
Solr will cache the separate fq= parameters and reuse them in any context. The
AND(ed) filter will be cached as a single
Utkarsh,
Check to see if the value is actually indexed into the field by using the Terms
request handler:
http://localhost:8983/solr/terms?terms.fl=text&terms.prefix=d
(adjust the prefix to whatever you're looking for)
This should get you going in the right direction.
Jason
On Sep 17, 2013
They have modified the mechanisms for committing documents…Solr in DSE is not
stock Solr...so you are likely encountering a boundary where stock Solr
behavior is not fully supported.
I would definitely reach out to them to find out if they support the request.
On Sep 5, 2013, at 8:27 AM, Ryan,
The circumstance in which I've most typically seen index.timestamp show up is when
an update is sent to a slave server. The replication then appears to preserve
the updated slave index in a separate folder while still respecting the correct
data from the master.
On Sep 5, 2013, at 8:03 PM, Shawn
One additional thought here: from a paranoid risk-management perspective it's
not a good idea to have two critical services dependent upon a single point of
failure if the hardware fails. Obviously risk-management is suited to taste,
so you may feel the cost/benefit does not merit the
(with
openSearcher=true) will work just fine (YMMV).
Jason
On Aug 14, 2013, at 4:51 AM, Erick Erickson erickerick...@gmail.com wrote:
right, SOLR-5081 is possible but somewhat unlikely
given the fact that you actually don't have very many
nodes in your cluster.
soft commits aren't relevant
It's been my experience that using the convenient feature to change the output
key still doesn't save you from having to map it back to the field name
underlying it in order to trigger the filter query. With that in mind it just
makes more sense to me to leave the effort in the View portion
While I don't have prior experience with this issue to use as a reference, if I were
in your shoes I would consider trying your updates with softCommit disabled.
My suspicion is you're experiencing some issue with the transaction logging and
how it's managed when your hard commit occurs.
If you can
The majority of the behavior outlined in that wiki page should work just fine
with 3.5.0. Note that there are only a few items that are marked
Solr 4.0-only (DirectSolrSpellChecker and WordBreakSolrSpellChecker, for
example).
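For anyone on 4.0, the DirectSolrSpellChecker mentioned above is registered roughly like this (a sketch; the field name text is an assumption):

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">text</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
  </lst>
</searchComponent>
```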
On Aug 9, 2013, at 6:26 AM, Kamaljeet Kaur
Or shingles, presuming you want to tokenize and output unigrams.
On Aug 2, 2013, at 11:33 AM, Walter Underwood wun...@wunderwood.org wrote:
Search against a field using edge N-grams. --wunder
On Aug 2, 2013, at 11:16 AM, T. Kuro Kurosaka wrote:
Is there a query parser that supports a
the result set you desire. Be aware that a very large
boolean set (your IN(…) parameter) may be expensive to run.
Jason
On Jul 29, 2013, at 7:33 AM, Benjamin Ryan benjamin.r...@manchester.ac.uk
wrote:
Hi,
Is it possible to construct a query in SOLR to perform a query
Nitin,
You need to ensure the fields you wish to see are marked stored=true in your
schema.xml file, and you should include fields in your fl= parameter
(fl=*,score is a good place to start).
Jason
On Jul 29, 2013, at 8:08 AM, Nitin Agarwal 2nitinagar...@gmail.com wrote:
Hi, I am using Solr
Or use the copyField technique to build a single searchable field and set df= to that
field. The example schema does this with the field called text.
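In schema.xml the copyField technique looks roughly like this (the source field names are hypothetical; text matches the example schema's catch-all field):

```xml
<copyField source="title" dest="text"/>
<copyField source="body" dest="text"/>
<!-- then point the default field at it, e.g. df=text in the request handler defaults -->
```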
On Jul 29, 2013, at 8:35 AM, Ahmet Arslan iori...@yahoo.com wrote:
Hi,
df is a single valued parameter. Only one field can be a default field.
can either change the value to true, or issue an explicit commit at the end of
your load (a solr/update?commit=true will default to
openSearcher=true).
Hope that's of use!
Jason
On Jul 15, 2013, at 9:52 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
I have a solr 4.3
cool.
so far I've been using the default collection1 only.
thanks,
Jason
On Thu, Jul 11, 2013 at 7:57 AM, Erick Erickson erickerick...@gmail.com wrote:
Just use the address in the url. You don't have to use the core name
if the defaults are set, which is usually collection1.
So it's
primary key still exist? We don't want to have to always change the primary
key format to ensure the uniqueness of the primary key among all different
types of database tables.
thanks!
Jason
want to commit the data from table2 to a new core? Does anyone know
how I can do that?
thanks,
Jason
On Wed, Jul 10, 2013 at 11:18 AM, David Quarterman da...@corexe.com wrote:
Hi Jason,
Assuming you're using DIH, why not build a new, unique id within the query
to use as the 'doc_id' for SOLR
You can also call the file admin request handler:
http://localhost:8983/solr/admin/file?file=schema.xml
…and parse the whole stinking thing :)
Jason
On Jul 6, 2013, at 1:59 PM, Steven Glass steven.gl...@zekira.com wrote:
Does anyone have any idea how I can access the schema version info
/solr/4_3_1/solr-core/org/apache/solr/search/similarities/SweetSpotSimilarityFactory.html
Jason
On Jul 5, 2013, at 5:59 AM, pravesh suyalprav...@yahoo.com wrote:
Is there a way to omitNorms and still be able to use {!boost b=boost} ?
Or you could leave omitNorms=false as usual and have your
to isolate domain data to single shards so as
to allow isolated queries against dedicated data models in single shards.
But if you just want the basics, it really is as easy as described above.
Jason
On Jul 5, 2013, at 7:36 PM, Ali, Saqib docbook@gmail.com wrote:
Hello Otis,
I was thinking
that approach?
Jason
On Jun 25, 2013, at 10:07 AM, Kevin Osborn kevin.osb...@cbsi.com wrote:
We are going to have two datacenters, each with their own SolrCloud and
ZooKeeper quorums. The end result will be that they should be replicas of
each other.
One method that has been mentioned is that we
Vinay,
What autoCommit settings do you have for your indexing process?
Jason
On Jun 24, 2013, at 1:28 PM, Vinay Pothnis poth...@gmail.com wrote:
Here is the ulimit -a output:
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority
the commit occurs at either
threshold. 30 seconds is plenty of time for 5 parallel processes of 20
document submissions to push you over the edge.
Jason
On Jun 24, 2013, at 2:21 PM, Vinay Pothnis poth...@gmail.com wrote:
I have 'softAutoCommit' at 1 second and 'hardAutoCommit' at 30 seconds
but with continued dialog
through channels like these there are fewer territories without good
cartography :)
Hope that's of use!
Jason
On Jun 24, 2013, at 7:12 PM, Scott Lundgren scott.lundg...@carbonblack.com
wrote:
Jason,
Regarding your statement "push you over the edge" - what does
Shalin,
There's one point to test without caches, which is to establish how much value
a cache actually provides.
For me, this primarily means providing a benchmark by which to decide when to
stop obsessing over caches.
But yes, for load testing I definitely agree :)
Jason
On Jun 21, 2013
by directory size (and not explicitly by the viewable
files) you may very well be seeing this.
Jason
On Jun 16, 2013, at 4:53 AM, Erick Erickson erickerick...@gmail.com wrote:
Optimizing will _temporarily_ double the index size,
but it shouldn't be permanent. Is it possible that
you have
with wildcard searches, or better yet NGram (EdgeNGram) behavior
to get the right suggestion data back.
I would suggest an additional core to accomplish this (fed via replication) to
avoid cache entry collision with your normal queries.
Hope that's useful to you.
Jason
On Jun 12, 2013, at 7:43 AM
(again, easily configured
via wildcard patterns) and then send the suggestion query to the right field.
Obviously this will get out of hand if you have too many of these...so this has
limits.
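A sketch of what "easily configured via wildcard patterns" can look like in schema.xml (field and type names here are hypothetical; text_edge_ngram is an assumed EdgeNGram-based fieldType):

```xml
<dynamicField name="*_suggest" type="text_edge_ngram" indexed="true" stored="false"/>
<!-- one suggestion field per category -->
<copyField source="title_books" dest="title_books_suggest"/>
<copyField source="title_music" dest="title_music_suggest"/>
```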
Jason
On Jun 11, 2013, at 8:29 AM, Aloke Ghoshal alghos...@gmail.com wrote:
Hi,
Trying to find a way
Roman,
Could you be more specific as to why replication doesn't meet your
requirements? It was geared explicitly for this purpose, including the
automatic discovery of changes to the data on the index master.
Jason
On Jun 4, 2013, at 1:50 PM, Roman Chyla roman.ch...@gmail.com wrote:
OK
you'd need a similar construct for each. I cannot attest to
performance at scale with such a construct…but just showing a way you can go
about this if you feel compelled enough to do so.
Jason
On Jun 3, 2013, at 8:08 AM, Jack Krupansky j...@basetechnology.com wrote:
No, but you can
Those are the defaults, though autoSoftCommit is commented out by default.
Keep in mind about the hard commit running every 15 seconds: it is not
updating your searchable data (due to the openSearcher=false setting). In
theory, your data should be searchable due to autoSoftCommit running every 1
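Those settings correspond to a solrconfig.xml fragment along these lines (a sketch; assuming the truncated "every 1" means every 1 second):

```xml
<autoCommit>
  <maxTime>15000</maxTime>           <!-- hard commit every 15 seconds -->
  <openSearcher>false</openSearcher> <!-- flush to disk without opening a new searcher -->
</autoCommit>
<autoSoftCommit>
  <maxTime>1000</maxTime>            <!-- make new documents searchable every 1 second -->
</autoSoftCommit>
```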
-robin distribute requests to other shards once
a query begins execution. But you do need an entry point externally to be
defined through your load balancer.
Hope this is useful!
Jason
On May 30, 2013, at 12:48 PM, James Dulin jdu...@crelate.com wrote:
Working to setup SolrCloud in Windows Azure
You have mentioned Pivot Facets, but have you looked at the Path Hierarchy
Tokenizer Factory:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PathHierarchyTokenizerFactory
This matches your use case, as best as I understand it.
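A minimal fieldType for it might look like this (the delimiter and names are assumptions):

```xml
<fieldType name="path" class="solr.TextField">
  <analyzer type="index">
    <!-- "Books/NonFic/Science" is indexed as Books, Books/NonFic, Books/NonFic/Science -->
    <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
```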
Jason
On May 28, 2013, at 12:47 PM, vibhoreng04
absolutely isolated results for paragraphs, and give you a
great deal of flexibility on how to query the results in cases where you do or
do not need them grouped.
Jason
On May 28, 2013, at 3:10 PM, Hard_Club meddn...@gmail.com wrote:
Thanks, Alexandre.
But I need to know in which paragraph
Sam,
I would highly suggest counting the words in your external pipeline and sending
that value in as a specific field. It can then be queried quite simply with a:
wordcount:{80 TO *]
(Note the { next to 80, excluding the value of 80)
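A minimal sketch of the pipeline side in Python (a naive whitespace count; your real tokenizer may differ):

```python
def word_count(text):
    """Naive whitespace-based word count, computed before indexing."""
    return len(text.split())

# "doc1" and the field names are hypothetical
doc = {"id": "doc1", "text": "some extracted document body text"}
doc["wordcount"] = word_count(doc["text"])
# index doc as usual, then query: wordcount:{80 TO *]
```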
Jason
On May 22, 2013, at 11:37 AM, Sam Lee skyn
And use the /terms request handler to view what is present in the field:
/solr/terms?terms.fl=text_es&terms.prefix=a
You're looking to ensure the index does, in fact, have the accented characters
present. It's just a sanity check, but could possibly save you a little
(sanity, that is).
Jason
Most definitely not the number of unique elements in each segment. My 32
document sample index (built from the default example docs data) has the
following:
entry#0:
'StandardDirectoryReader(segments_b:29 _8(4.2.1):C32)'='manu_exact',class
Rishi,
Fantastic! Thank you so very much for sharing the details.
Jason
On May 17, 2013, at 12:29 PM, Rishi Easwaran rishi.easwa...@aol.com wrote:
Hi All,
It's Friday 3:00pm, warm and sunny outside, and it was a good week. Figured I'd
share some good news.
I work for AOL mail team
The first rule of Solr without a Unique Key is that we don't talk about Solr
without a Unique Key.
The second rule...
On May 16, 2013, at 8:47 PM, Jack Krupansky j...@basetechnology.com wrote:
Technically, core Solr does not require a unique key. A lot of features in
Solr do require unique
category…and, of course, the entire set of documents
considered for these facets is constrained by the current query.
I think this maps to your requirement.
Jason
On May 16, 2013, at 12:29 PM, David Larochelle
dlaroche...@cyber.law.harvard.edu wrote:
Is there a way to get aggregate word counts
Peter,
Thanks for taking the time to spell out what you were going
through. It's great to have details like this to mull over.
Jason
On
2013-05-14 12:44, Lee, Peter wrote:
Thank you one and all for your
input.
The problem we were tripping over turned out NOT to be
related to using
I have run across plenty of implementations using just about every common
servlet container on the market, and haven't run across any common problems that
would dissuade you from any one of them.
On the JVM front most people seem to use Oracle because of its ubiquity. But I
have also run across a
You learn the gosh-darndest things:
http://localhost:8983/solr/browse?q=ipod&boost=product(price,-2)&debugQuery=on
…nets:
-0.3797992 = (MATCH) sum of:
0.13510442 = (MATCH) max of:
0.045963455 = (MATCH) weight(text:ipod^0.5 in 4) [DefaultSimilarity],
result of:
0.045963455 =
save a lot of headache and time.
Jason
On May 10, 2013, at 7:32 AM, Dyer, James james.d...@ingramcontent.com wrote:
Nicholas,
It sounds like you might want to use WordBreakSolrSpellChecker, which gets
obscure mention in the wiki. Read through this section:
http://wiki.apache.org/solr
of queries.
You may also want to consider having a Master/Slave relationship via
replication for higher availability. It is trivial to set up and works like a
charm.
Jason
On May 10, 2013, at 8:14 AM, milen.ti...@materna.de wrote:
Hello together!
I've been googling on this topic
SolrCloud) configuration. You have a lot of options! But the
replication master/slave behavior is rock solid and does nearly everything you
seek.
Jason
On May 10, 2013, at 8:40 AM, milen.ti...@materna.de wrote:
Hello Jason,
Thanks for Your quick response! The alternative of using the Solr
One more tip on the use of filter queries.
DO: fq=name1:value1&fq=name2:value2&fq=namen:valuen
DON'T: fq=name1:value1 AND name2:value2 AND name3:value3
Where OR operators apply, this does not matter. But your Solr cache will be
much more savvy with the first construct.
Jason
On May 10, 2013
And for 10,000 documents across n shards, that can be significant!
On May 10, 2013, at 11:43 AM, Joel Bernstein joels...@gmail.com wrote:
How many shards are in your collection? The query aggregator node will pull
back the results from each shard and hold the results in memory. Then it
will
Consider further that term vector data and highlighting becomes very useful if
you highlight externally to Solr. That is to say, you have the data stored
externally and wish to re-parse positions of terms (especially synonyms) from
source material. This is a (not too uncommon) technique used
the group.offset
parameter. This will shift the position in the returned array of documents to
the value provided.
Thus:
group.limit=1&group.field=companyid&group.offset=1
…would return the second item in each companyid group matching your current
query.
Jason
On May 9, 2013, at 10:30 AM, Luis
From:
http://lucene.apache.org/solr/4_3_0/changes/Changes.html#4.3.0.upgrading_from_solr_4.2.0
Slf4j/logging jars are no longer included in the Solr webapp. All logging jars
are now in example/lib/ext. Changing logging impls is now as easy as updating
the jars in this folder with those
Purely from empirical observation, both the DocumentCache and QueryResultCache
are being populated and reused in reloads of a simple MLT search. You can see
in the cache inserts how much extra-curricular activity is happening to
populate the MLT data by how many inserts and lookups occur on
If you nab the jars in example/lib/ext and place them within the appropriate
folder in Tomcat (and this will somewhat depend on which version of Tomcat you
are using…let's presume tomcat/lib as a brute-force approach) you should be
back in business.
On May 9, 2013, at 11:41 AM, richardg
lcguerreroc...@gmail.com wrote:
Thank you for the prompt reply, Jason. The group.offset parameter is working
for me, now I can iterate through all items for each company. The problem
I'm having right now is pagination. Is there a way how this can be
implemented out of the box with solr?
Before
I have to imagine I'm quibbling with the original assertion that Solr 4.x is
architected with a dependency on Zookeeper when I say the following:
Solr 4.x is not architected with a dependency on Zookeeper. SolrCloud,
however, is. As such, if a line of reasoning drives greater concern about
Hello.
I'm trying to figure out if Solr is going to work for a new project that I am
wanting to build. At its heart it's a book text searching application. Each
book is broken into chapters and each chapter is broken into lines. I want to
be able to search these books and return relevant
, with fields for book,
chapter, page, and line number.
-- Jack Krupansky
-Original Message- From: Jason Funk
Sent: Tuesday, April 23, 2013 5:02 PM
To: solr-user@lucene.apache.org
Subject: Book text with chapter line number
Hello.
I'm trying to figure out if Solr is going to work
Hi, Upayavira
I know multiple segments are not a problem.
But I always optimize the index on the master server before replicating.
So just a single segment file is on the master.
The file listing of the master server directory is below.
Additionally, segments_1 and segments_2 on the slave server were deleted by hand.
Hi, Erick
I didn't configure anything for index backup.
My ReplicationHandler configuration is below.
Other settings in solrconfig.xml are almost all defaults.
Is there a deletion policy for replication?
I know the maxNumberOfBackups parameter, but this is for the master server.
Is there any configuration for
Hi,
I'm using master/slave replication on Solr 4.0.
Replication runs successfully.
But the old index is not cleaned up.
Is that a bug or not?
My slave index directory is below...
$ ls -l solr_kr/krg01/data/index/
total 23472512
-rw-r--r--. 1 tomcat tomcat 563722625 Dec 24 21:48 _15.fdt
-rw-r--r--.
I'm using master and slave servers for scaling.
The master is dedicated to indexing and the slave is for searching.
Now, I'm planning to move to SolrCloud.
It has a leader and replicas.
The leader acts like a master and the replicas act like slaves. Is that right?
So, I'm wondering two things.
First,
How can I assign
Hi,
I'm encountering the below error repeatedly when trying out distributed search.
At that time, none of the servers were stalled.
Does anyone know what the problem is?
2012-10-18 09:09:54,813 [http-8080-exec-8819] ERROR
org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException:
the khugepaged process is and why it's eating 100% CPU and
when it's run.
Please, someone explain this to me.
Thanks,
Jason
--
View this message in context:
http://lucene.472066.n3.nabble.com/khugepaged-runnging-and-eating-100-cpu-tp4014635.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi,
We're using Solr 4.0 and servicing patent search.
Patent search tends to involve very complex queries, including wildcards.
I think an Ngram or EdgeNgram filter is an alternative.
But not every term included in a query has a wildcard,
so we can't use that filter.
If I make an empty core and use it in the main core
We're running 10 solr cores(c00,c01,...,c09) in a box and querying like
http://x.x.x.x/c00/select?q=test&shards=c00,c01,..,c09
This means all of the result are merged in core c00.
Is this not a good use of a shards search?
When we analyze the log file, the query response time in core c00 is often too long.
How
Hi, Otis
Thanks for your reply.
Yes, all cores are in the same server.
* what do you consider too long?
Even a simple id (key) query response sometimes takes too long.
Almost all id (key) query responses take under 10ms.
example
-
2012-10-05 16:38:32,078 [http-8080-exec-3979] INFO
is too big)?
thanks!
Jason
!
Jason
On Thu, Oct 4, 2012 at 1:36 PM, Tomás Fernández Löbbe
tomasflo...@gmail.com wrote:
SolrCloud doesn't auto-shard at this point. It doesn't split indexes either
(there is an open issue for this:
https://issues.apache.org/jira/browse/SOLR-3755 )
At this point you need to specify
Thanks Otis.
This starts to make more sense to me. I will go through the links in
your signature and dig into it.
Still learning but this is a good direction.
thanks!
Jason
On Thu, Oct 4, 2012 at 2:55 PM, Otis Gospodnetic
otis.gospodne...@gmail.com wrote:
Hi,
You could start with one node
Distinct in a distributed environment would require de-duplication
en masse; use Hive or MapReduce instead.
On Wed, Sep 12, 2012 at 11:53 AM, yriveiro yago.rive...@gmail.com wrote:
Hi,
Is there a possibility of doing a distinct group count in a grouping done using
a sharding schema?
This issue
or something like metasearch (I'm using Ruby on
Rails).
Jason
On Thu, Aug 9, 2012 at 5:49 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:
: Is it possible to connect to SOLR over a socket file as is possible
: with mysql? I've looked around and I get the feeling that I may be
: misunderstanding
).
Jason
On Tue, Aug 7, 2012 at 9:14 PM, Michael Kuhlmann k...@solarier.de wrote:
On 07.08.2012 21:43, Jason Axelson wrote:
Hi,
Is it possible to connect to SOLR over a socket file as is possible
with mysql? I've looked around and I get the feeling that I may be
misunderstanding part
Hi,
Is it possible to connect to SOLR over a socket file as is possible
with mysql? I've looked around and I get the feeling that I may be
misunderstanding part of SOLR's architecture.
Any pointers are welcome.
Thanks,
Jason
I've got SocketException (Connection reset) frequently.
This occurs during distributed search and is logged like below on the request
server.
At first, I thought that the reason for the exception was the long GC pause time
of the JVM.
So I changed the connectionTimeout of the connector in tomcat server.xml to
6ms.
Hi Hans
Yes, that remote server is OK.
Actually, we got this error when the remote server was executing garbage
collection, and that time was over about 1 minute.
The remote server is very busy and memory usage is high.
Actually, we got this error when the remote server was executing garbage
collection, and that time was over about 1 minute.
The Solr server is sometimes frozen during GC, which causes a connection refused
error.
Our gc option is -XX:+UseParallelGC -XX:+UseParallelOldGC
-XX:+AggressiveOpts
Response waiting is
Hi Amit,
If the caches were per-segment, then NRT would be optimal in Solr.
Currently the caches are stored per-multiple-segments, meaning after each
'soft' commit, the cache(s) will be purged.
On Fri, Jul 6, 2012 at 9:45 PM, Amit Nithian anith...@gmail.com wrote:
Sorry I'm a bit new to the
, as with some other Apache
licensed Lucene based search engines.
On Sat, Jul 7, 2012 at 10:42 AM, Yonik Seeley yo...@lucidimagination.com wrote:
On Sat, Jul 7, 2012 at 9:59 AM, Jason Rutherglen
jason.rutherg...@gmail.com wrote:
Currently the caches are stored per-multiple-segments, meaning after
to per-segment? How do I do that?
Thanks.
From: Jason Rutherglen jason.rutherg...@gmail.com
To: solr-user@lucene.apache.org
Sent: Saturday, July 7, 2012 11:32 AM
Subject: Re: Nrt and caching
The field caches are per-segment, which are used for sorting
Average should be doable in Solr, maybe not today, not sure. Median is the
challenge :) Try Hive.
On Sat, Jul 7, 2012 at 3:34 PM, Walter Underwood wun...@wunderwood.org wrote:
It sounds like you need a database for analytics, not a search engine.
Solr cannot do aggregates like that. It can
://LinkedIn.com/in/JeremyBranham
http://jeremybranham.wordpress.com/
http://Zeroth.biz
-Original Message- From: Jason Rutherglen
Sent: Saturday, July 07, 2012 2:45 PM
To: solr-user@lucene.apache.org
Subject: Re: Grouping and Averages
Average should
with activity regarding adding this
feature to Solr.
On Sat, Jul 7, 2012 at 8:32 PM, Andy angelf...@yahoo.com wrote:
Jason,
If I just use stock Solr 4.0 without modifying the source code, does that
mean multi-value faceting will be very slow when I'm constantly
inserting/updating documents
There isn't a solution for killing long running queries that works.
On Tue, Jun 5, 2012 at 1:34 AM, arin_g arin...@gmail.com wrote:
Hi,
We use solrcloud in production, and we are facing some issues with queries
that take very long specially deep paging queries, these queries keep our
servers
I think DataStax Enterprise is faster than Solr Cloud with transaction
logging turned on. Cassandra has its own fast(er) transaction
logging mechanism. Of course it's best to use two HDs when testing,
eg, one for the data, the other for the transaction log.
On Fri, Apr 27, 2012 at 12:58 PM,
Hi,
I have a question concerning the spatial field type LatLonType and populating
it via an embedded solr server in java.
So far I've only ever had to index simple types like boolean, float, and
string. This is the first complex type. So I'd like to use the following field
definition for
indices, nodes and aliases on the fly, I think there is a way to handle a
growing data set with ease. If anyone is interested, this scenario has been
discussed in detail on the ES mailing list.
Regards,
Lukas
On Tue, Apr 17, 2012 at 2:42 AM, Jason Rutherglen
jason.rutherg...@gmail.com wrote:
One
not want to run into system X vs system Y flame here...)
Regards,
Lukas
On Wed, Apr 18, 2012 at 2:22 PM, Jason Rutherglen
jason.rutherg...@gmail.com wrote:
I'm curious how on the fly updates are handled as a new shard is added
to an alias. Eg, how does the system know to which shard
rearranging the
hash 'ring' both logically and physically.
In addition, there is the potential for data loss, which Cassandra has
the technology to address.
On Tue, Apr 17, 2012 at 1:33 PM, Otis Gospodnetic
otis_gospodne...@yahoo.com wrote:
I think Jason is right - there is no index splitting in ES and SolrCloud
One of the big weaknesses of Solr Cloud (and ES?) is the lack of the
ability to redistribute shards across servers. Meaning, as a single
shard grows too large, splitting the shard while serving live updates.
How do you plan on elastically adding more servers without this feature?
Cassandra and HBase
This was done in SOLR-1301 going on several years ago now.
On Sat, Apr 14, 2012 at 4:11 PM, Lance Norskog goks...@gmail.com wrote:
It sounds like you really want the final map/reduce phase to put Solr
index files into HDFS. Solr has a feature to do this called 'Embedded
Solr'. This packages
One thing that could fit the pattern you describe would be Solr caches
filling up and getting you too close to your JVM or memory limit.
This [uncommitted] issue would solve that problem by allowing the GC
to collect caches that become too large, though in practice, the cache
setting would need
).
If you are fine with that, then your statements are contradictory.
On Thu, Jan 19, 2012 at 12:31 PM, Steven A Rowe sar...@syr.edu wrote:
Jason,
If I understand you correctly, you're referring to a thread
http://search-lucene.com/m/iMCFOqzcmS1/%22Performance+Monitoring+SaaS+for+Solr%22/v
Steven,
If you are going to admonish people for advertising, it should be
equally dished out or not at all.
On Wed, Jan 18, 2012 at 6:38 PM, Steven A Rowe sar...@syr.edu wrote:
Hi Peter,
Commercial solicitations are taboo here, except in the context of a request
for help that is directly