Hello everyone,
I've tested atomic updates via Ajax calls and now I'm starting with atomic
updates via SolrJ... but the way I'm proceeding doesn't seem to work well.
Here is the snippet:
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", myId);

Map<String, List<String>>
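To make the intended pattern concrete, here is a minimal, self-contained sketch of how the atomic-update value map could be built (the "set" key follows the atomic-update syntax; the field name and values are placeholders, and in SolrJ the map would then be passed to doc.addField()):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AtomicUpdateSketch {

    // Builds the value map for an atomic "set" operation.
    // The values here are placeholders for illustration.
    public static Map<String, List<String>> buildSetOperation() {
        Map<String, List<String>> operation = new HashMap<String, List<String>>();
        operation.put("set", Arrays.asList("Value1", "Value2"));
        return operation;
    }

    public static void main(String[] args) {
        // In SolrJ this map would be used as the field value, e.g.:
        //   doc.addField("fieldName", buildSetOperation());
        System.out.println(buildSetOperation());
    }
}
```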
If you are using DIH, it's just a matter of doing something like this (for a
MySQL project I have around, for example):
CONCAT(lat, ',', lon) AS latlon
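For context, that expression would sit inside the DIH entity query, with the concatenated value mapped onto a location-typed field (a sketch; the entity, table, and field names are assumptions):

```
<entity name="place" query="SELECT id, CONCAT(lat, ',', lon) AS latlon FROM places">
  <field column="latlon" name="location"/>
</entity>
```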
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-4-0-Spatial-Search-schema-xml-and-data-config-xml-tp4020376p4020437.html
Sent from the Solr - User mailing list archive at Nabble.com.
Thread update:
When I use a simple:
Map operation = new HashMap();
Instead of:
Map<String, List<String>> operation = new HashMap<String, List<String>>();
The result looks better, but it's still wrong:
fieldName: [
[Value1, Value2]
],
However, the List<String> value is received as a simple String
Hi - you're likely seeing a drop in performance because of durability, which is
enabled by default via a transaction log. When disabled, 4.0 is IIRC slightly
faster than 3.x.
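For reference, durability comes from the <updateLog> element in solrconfig.xml; commenting it out disables the transaction log (a sketch based on the stock 4.0 example config):

```
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Comment this out to disable the transaction log (and with it, durability). -->
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
</updateHandler>
```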
-Original message-
From:Nils Weinander nils.weinan...@gmail.com
Sent: Thu 15-Nov-2012 10:35
To:
Ah, thanks Markus!
That's a good thing. I tried disabling the transaction log; the difference in
performance is marginal. So, I'll stick with the transaction logging.
On Thu, Nov 15, 2012 at 11:02 AM, Markus Jelsma
markus.jel...@openindex.io wrote:
Hi - you're likely seeing a drop in performance
On Thu, Nov 15, 2012 at 11:51 AM, Luis Cappa Banda luisca...@gmail.com wrote:
Thread update:
When I use a simple:
Map operation = new HashMap();
Instead of:
Map<String, List<String>> operation = new HashMap<String, List<String>>();
The result looks better, but it's still wrong:
Hello, Sami.
It will be the first issue that I open, so should I create it under the Solr
4.0 version or the Solr 4.1.0 one?
Thanks,
- Luis Cappa.
2012/11/15 Sami Siren ssi...@gmail.com
On Thu, Nov 15, 2012 at 11:51 AM, Luis Cappa Banda luisca...@gmail.com
wrote:
Thread update:
When I use
Ok, done:
https://issues.apache.org/jira/browse/SOLR-4080
Regards,
- Luis Cappa.
2012/11/15 Luis Cappa Banda luisca...@gmail.com
Hello, Sami.
It will be the first issue that I open, so should I create it under the Solr
4.0 version or the Solr 4.1.0 one?
Thanks,
- Luis Cappa.
2012/11/15
Actually it seems that the xml/binary request writers only behave differently
when using an array[] as the value. If I use an ArrayList it also works with
the xml format (4.1 branch). Still, it's annoying that the two request writers
behave differently, so I guess it's worth adding the jira anyway.
The
I'll have a look at the Solr source code and try to fix the bug. If I succeed
I'll update the JIRA issue with it. :-)
2012/11/15 Sami Siren ssi...@gmail.com
Actually it seems that xml/binary request writers only behave differently
when using array[] as the value. if I use ArrayList it also works with
Seems like pivot faceting is what you're looking for (
http://wiki.apache.org/solr/SimpleFacetParameters#Pivot_.28ie_Decision_Tree.29_Faceting
)
Note: it currently does not work in distributed mode - see
https://issues.apache.org/jira/browse/SOLR-2894
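As an illustration, a pivot facet request looks something like this (host, core, and the cat/inStock field names are just the stock-example defaults):

```
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.pivot=cat,inStock
```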
On Thu, Nov 15, 2012 at 7:46 AM, Jamie Johnson
Yes, this is what I'm trying to do. But stuff related to the document, like
language/title/... (I've got way more fields), is stored many times. Each page
has a part of data that's the same; is it possible to separate that data?
Thanks for the info, Mark. By "a request won't return until it's affected
all replicas", are you referring to the update request or the query?
Bill
On Wed, Nov 14, 2012 at 7:57 PM, Mark Miller markrmil...@gmail.com wrote:
It's included as soon as it has been indexed - though a request won't
Hi, Sami.
Doing some tests I've used the same code as you and did a quick execution:

HttpSolrServer server = new HttpSolrServer(
    "http://localhost:8080/solrserver/core1");

try {

    HashMap editTags = new HashMap();
    editTags.put("set",
Try setting Request writer to binary like this:
server.setParser(new BinaryResponseParser());
server.setRequestWriter(new BinaryRequestWriter());
Or then, instead of a string array, use an ArrayList<String> that contains your
strings as the value for the map.
On Thu, Nov 15, 2012 at 3:58
Uhm, after setting both the Response and Request Writers it worked OK with
HttpSolrServer. I've tried to find a way to set BinaryResponseParser and
BinaryRequestWriter with CloudSolrServer (or even via LBHttpSolrServer), but
I found nothing.
Suggestions? :-/
- Luis Cappa.
2012/11/15 Sami Siren
:)
Just installed 3.6.1 and it's working just fine.
Something must be wrong with my tomcat/solr install.
Thank you Robert.
//Frederico
-Original message-
From: Robert Muir [mailto:rcm...@gmail.com]
Sent: Wednesday, 14 November 2012 19:18
To:
The particular JavaScript I referred to is this:
function processAdd(cmd) {
  doc = cmd.solrDoc; // org.apache.solr.common.SolrInputDocument
  lat = doc.getFieldValue("LATITUDE");
  lon = doc.getFieldValue("LONGITUDE");
  if (lat != null && lon != null)
    doc.setField("latLon", lat + "," + lon);
}
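For context, a script like this is typically wired in through an update request processor chain in solrconfig.xml (a sketch; the chain name and script file name are assumptions):

```
<updateRequestProcessorChain name="script">
  <processor class="solr.StatelessScriptUpdateProcessorFactory">
    <str name="script">update-script.js</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```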
I'm talking about an update request. So if you make an update, when it
returns, your next search will see the update, because it will be on
all replicas. Another process that is searching rapidly may see an
eventually consistent view though (very briefly). We have some ideas
to make that view more
Hi James,
Just gave it a go and it worked! That's the good news. The problem now is
getting it to work faster. It took over 2 hours just to index 4 views and I
need to get information from 26.
I tried adding defaultRowPrefetch=2 as a JDBC parameter but it
does not seem to honour that. It
Hello,
I've found what seems to be a bug
(SOLR-4080: https://issues.apache.org/jira/browse/SOLR-4080?focusedCommentId=13498055&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13498055)
with
CloudSolrServer during atomic updates via SolrJ. Thanks to Sami I
Maybe you can start by testing this with split -l and xargs :-) These are
standard Unix toolkit approaches and since you use one of them (curl) you
may be happy to use others too.
Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn:
Hello,
I don't know if this is a bug or a missing feature, nor if it was corrected
in new versions of Solr (can't find any JIRA about it), so I just want to
show you the problem...
I can't test with Solr 4.0, I have a legacy system, not a lot of time, not
a Solr expert at all and it seems just
On Nov 15, 2012, at 8:02 AM, Sébastien Lorber lorber.sebast...@gmail.com
wrote:
<entity name="PARAM" query="SELECT key_name AS KEY, string_val AS
    VALUE FROM BATCH_JOB_PARAMS WHERE JOB_INSTANCE_ID =
    ${JOB_EXEC.JOB_INSTANCE_ID}">
  <field column="VALUE" name="JOB_PARAM_${PARAM.KEY}" />
</entity>
Mark Miller-3 wrote
I'm talking about an update request. So if you make an update, when it
returns, your next search will see the update, because it will be on
all replicas.
I presume this is only the case if (of course) the client also sent a
commit. So you're saying the commit call will not
Here is our setup:
Solr 4.0
Master replicates to three slaves after optimize
We have a problem where, every so often after replication, the CPU load on the
slave servers maxes out and requests come to a crawl.
We do a dataimport every 10 minutes and depending on the number of updates
since the
Thanks for wrapping this up, it's always nice to get closure, especially
when it comes to googling G..
On Wed, Nov 14, 2012 at 5:34 AM, Adam Neal an...@mass.co.uk wrote:
Just to wrap up this one. Previously all the lib jars were located in the
war file on our setup, this was mainly to ease
Ah... sure, you can create a schema that has several different document
types in it, with extra fields that are used in some but not all documents -
books have the metadata fields but no page bodies while pages have page
bodies but no metadata. And maybe even do a Solr join for the block of
Well, what does maintenance entail? Changing schema? Rebuilding the index?
Many operations under the maintenance rubric can be done with core admin
handler requests, see:
http://wiki.apache.org/solr/CoreAdmin
But if that doesn't solve your problem, then probably running in two
separate JVMs is
I think this is rather dangerous. How would these multiple slaves
coordinate replication? Would they all replicate at once? If only one was
configured to replicate, how would the others know to reopen searchers?
Furthermore, simply opening up more Solr instances on the same machine
isn't expanding
Gerald:
Here's the place to start: http://wiki.apache.org/solr/HowToContribute
But the basic setup is:
1. Create a JIRA login (anyone can)
2. Create a JIRA issue if one doesn't exist
3. Generate the patch. From your root level (the one that contains the solr
and lucene dirs), run svn diff > SOLR-###.patch where
Currently you have to re-index all of your data. If you don't you'll have a
situation in which the same document (by uniqueKey) exists in two shards
and that document may show up twice in your results list.
NOTE: by "reindex all your data", I mean you need to _delete_ all your data first.
If you just add
Hello,
I would like to implement a suggester with Solr; which is the best way now
in your opinion?
thanks in advance
I.
-
Complicare è facile, semplificare é difficile.
Complicated is easy, simple is hard.
quote: http://it.wikipedia.org/wiki/Bruno_Munari
It depends - no commit necessary for realtime get. Otherwise, yes, you would
need to do at least a soft commit. That works the same way though - so if you
make your update, then do a soft commit, you can be sure your next search will
see the update on all the replicas. And with realtime get, of
hi,
did you try setting your values in a List, for example an ArrayList? It should
work when you use that, even without specifying the request/response writer.
--
Sami Siren
On Thu, Nov 15, 2012 at 4:56 PM, Luis Cappa Banda luisca...@gmail.com wrote:
Hello,
I've found what seems to be a bug
Yes, my first attempt was with a List<String>, but it didn't work. Then I
started to try other ways, such as a String[] array, with no success.
Regards,
- Luis Cappa.
2012/11/15 Sami Siren ssi...@gmail.com
hi,
did you try setting your values in a List, for example ArrayList it should
work
Oh I'm sorry, I should have read your question more clearly. I totally
forgot that solr.PointType supports a configurable number of dimensions. If
you need more than 2 dimensions as your example shows you do, then you'll
have to resort to indexing your spatial data in another Solr core as
Hi Yun,
Not sure I understand your need...
There is no relationship between a query string and DIH.
What you want to achieve (if "fetch 1 rows" means "select 1 rows
from a table") can be done by limiting the number of rows your SQL select
will return (the syntax differs from DBMS to DBMS).
Wasn't obvious ;).
Maybe you could try local params...something like
q={!q.op=OR%20rows=3}yourQueryHere
Hope this helps
Dom
2012/11/15 jefferyyuan yuanyun...@gmail.com
Thanks for the reply.
I am using SolrEntityProcessor to import data from another remote solr
server - not database, so
Hello Floyd,
There is a ton of research literature out there comparing BM25 to vector
space. But you have to be careful interpreting it.
BM25 originally beat the SMART vector space model in the early TRECs
because it did better tf and length normalization. Pivoted Document
Length
Depending on how much data you're pulling back, 2 hours might be a reasonable
amount of time. Of course if you had it a lot faster with Endeca Forge, I
can understand your questioning this. Keep in mind that the way you're setting
it up, it will build each cache one at a time. I'm pretty sure
Hi David,
thanks for your reply.
I've tested this datatype and the values are indexed fine (I'm using
6-dimensions points).
I'm trying to retrieve results and it works only with the first 2 dimensions
(X and Y), but it's not taking into account the other 4 dimensions.
I've been reading the
I figured out you can disable the core admin in solr.xml, but then it
breaks the admin as apparently it relies on that.
I tried tomcat security but haven't been able to make it work.
I think at this point I may just write a query/debugging app that the
developers could use.
On 11/13/2012
Borja,
Umm, I'm quite confused with the use-case you present.
~ David
-
Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context:
http://lucene.472066.n3.nabble.com/PointType-multivalued-query-tp4020445p4020609.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi,
I have a question about the optimal way to distribute solr indexes across a
cloud. I have a small number of collections (less than 10). And a small
cluster (6 nodes), but each node has several disks - 5 of which I am using for
my solr indexes. The cluster is also a hadoop cluster, so the
Hi,
I think it's not a good idea to make Join operations between Solr cores
because of the performance (we manage a lot of data).
The point is that we want to store documents, each one with several
information sets (let's name them Points), each one identified by 6 values
(that's why I was
Sorry I tried to explain it too fast.
Imagine the usecase that I wrote on the first post.
A document can have more than one 6-Dimensions point. So my first approach
was:
<doc>
  <field name="pk">1</field>
  <field name="docId">10</field>
  <field name="point">2,2,2,2,2,2</field>
</doc>
<doc>
  <field name="pk">2</field>
Sorry, you're out of luck. SRPT could be generalized but that's a bit of
work. The trickiest part I think would be writing a multi-dimensional
SpatialPrefixTree impl.
If the # of discrete values at each dimension is pretty small (100? ish?),
then there is a way using term positions and span
Personally I see no benefit to have more than one JVM per node, cores
can handle it. I would say that splitting a 20m index into 25 shards
strikes me as serious overkill, unless you expect to expand
significantly. 20m would likely be okay with two or three shards. You
can store the indexes for
One question is, why optimise? The newer TieredMergePolicy, as I
understand it, takes away much of the need for optimising an index.
As to maxing, after a replication your caches need warming. Watch how
often you replicate, and check on the admin UI how long it takes to warm
caches. You may be
The main reason to split a collection into 25 shards is to reduce the impact of
the loss of a disk. I was running an older version of solr, a disk went down,
and my entire collection was offline. Solr 4 offers shards.tolerant to reduce
the impact of the loss of a disk: fewer documents will be
Unfortunately, this doesn't seem to solve the issue; now I'm beginning
to wonder if maybe it's because I'm on Windows. Has anyone successfully
run ZkCLI on Windows?
Nick
On 11/12/2012 2:27 AM, Jeevanandam Madanagopal wrote:
Nick - Sorry, embedded links are not shown in previous email.
I think Bill was asking about search
I think the Q is whether the query hitting the shard where a doc was sent
for indexing would see that doc even before that doc has been copied to
replicas.
I didn't test it, but I'd think the answer would be positive because of the
transaction log.
Otis
--
But slower indexing with solr 4.0 sounds suspicious to me... you compared
your configs? JVM parameters? GC? IO? CPU?
Otis
--
Performance Monitoring - http://sematext.com/spm
On Nov 15, 2012 5:26 AM, Nils Weinander nils.weinan...@gmail.com wrote:
Ah, thanks Markus!
That's a good thing. I
Did you start from scratch, or did you bulk index into an existing index?
There is some backcompat logic in there, which is convenient, but not
necessarily the best performance.
-- Jack Krupansky
-Original Message-
From: Nils Weinander
Sent: Thursday, November 15, 2012 1:29 AM
To:
Hi,
I think here you want to use a single JVM per server - no need for multiple
JVMs, JVM per Collection and such.
If you can spread data over more than 1 disk on each of your servers,
great, that will help.
Re data loss - yes, you really should just be using replication. Sharding
a ton will
Hi Koji,
Thank you for your reply..will test for the same.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Patch-Needed-for-Issue-Solr-3790-tp4019256p4020651.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi Iwo,
This is kind of a common question. Have a look at
http://search-lucene.com/?q=autocomplete+OR+suggester&fc_project=Solr&fc_type=mail+_hash_+user
for lots of discussions on this topic.
In short, you could use the Suggester that comes with Solr or you could do
Thanks everyone, and especially Tom: you did give me a detailed explanation
of this topic.
Of course in academia we need to interpret results carefully; what I care
about is, from the end-user's point of view, will using BM25 result in better
ranking than using Lucene's original VSM+Boolean model?
First query is OK; it just doesn't fit your need if I understand
Could you confirm that the expected result is 6 rows (3 rows with ppt plus 3
rows with pdf)?
2012/11/15 jefferyyuan yuanyun...@gmail.com
Thanks :)
local param is very useful, but seems it doesn't work here:
I tried:
Scott,
I probably have no idea as to what I'm saying, but if you're looking for
finding results in a N-dimensional space, you might look at creating a
field of type 'point'. Point-type fields have a dimension attribute; I
believe that it can be set to a large integer value.
Barring that, there
Yun,
Literally, you can call another QParser from the middle of a query and apply
local params to it via the nested queries feature
(http://searchhub.org/2009/03/31/nested-queries-in-solr/); the syntax is a
little bit tricky though.
But calling another QParser and attempting to specify the number of rows for
it makes
Scott,
It sounds like you need to look into a few samples of similar things in
Lucene. Off the top of my head: FuzzyQuery from 4.0, which finds terms similar
to the given one in an FST, for query expansion. Generic query expansion is
done via MultiTermQuery. Index-time term expansion is shown in TrieField and