Re: Index 0, Size 0 - hashJoin Stream function Error

2017-06-22 Thread Susheel Kumar
{ "EOF": true, "RESPONSE_TIME": 18 } ] } } And after uncomment var d above, even though we are displaying a, we get results shown below. I understand that join in my test data didn't find any match but then it should not skew up the results of var a. W

Index 0, Size 0 - hashJoin Stream function Error

2017-06-22 Thread Susheel Kumar
ind any match but then it should not skew up the results of var a. When data matches during join then it's fine, but otherwise I am running into this issue and the subsequent expressions don't get evaluated due to this... { "result-set": { "docs": [ { "EXCEPTION": "Index: 0, Size: 0", "EOF": true, "RESPONSE_TIME": 44 } ] } }

Re: Error after moving index

2017-06-22 Thread Michael Kuhlmann
llo, > > > > I created an index on my local machine (Windows 10) and it works fine there. > > After uploading the index to the production server (Linux), the server shows > an error: .

Error after moving index

2017-06-22 Thread Moritz Munte
Hello, I created an index on my local machine (Windows 10) and it works fine there. After uploading the index to the production server (Linux), the server shows an error: java.util.concurrent.ExecutionException: org.apache.solr.common.SolrException: Unable to create core [contentselect_v3

Re: Solr 6: how to get SortedSetDocValues from index by field name

2017-06-20 Thread SOLR4189
Hi, Tomas. It helped. Thank you. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-6-how-to-get-SortedSetDocValues-from-index-by-field-name-tp4340388p4342002.html Sent from the Solr - User mailing list archive at Nabble.com.

can you fix this index?

2017-06-20 Thread Zhang, Ziqi
I am running a program that crawls the web and saves data into a Solr index. For mysterious reasons, the Solr server crashed, and I now have a corrupted index that has no segment files, so I risk losing all the data collected over 5 days. The error message reads as below when

Re: Solr 6: how to get SortedSetDocValues from index by field name

2017-06-15 Thread Chris Hostetter
"X" so that we can understand the full issue. Perhaps the best solution doesn't involve "Y" at all? See Also: http://www.perlmonks.org/index.pl?node_id=542341 : How do I get SortedSetDocValues from index by field name? : : I try it and it works for me but I didn't understand why

Re: Solr 6: how to get SortedSetDocValues from index by field name

2017-06-14 Thread Tomas Fernandez Lobbe
Hi, To respond to your first question: “How do I get SortedSetDocValues from index by field name?”, DocValues.getSortedSet(LeafReader reader, String field) (which is what you want to use to assert the existence and type of the DV) will give you the dv instance for a single leaf reader. In general

Solr 6: how to get SortedSetDocValues from index by field name

2017-06-13 Thread SOLR4189
How do I get SortedSetDocValues from index by field name? I try it and it works for me but I didn't understand why to use leaves.get(0)? What does it mean? (I saw such using in TestUninvertedReader.java of SOLR-6.5.1): *Map<String, UninvertingReader.Type> mapping = new HashMap<>();

Re: Sharding vs single index vs separate collection

2017-06-08 Thread Susheel Kumar
routing. Can you give some no# how many will utilise routing > vs not routing? > > In general, we should try to serve all the queries with one > index/collection which can be shared if needed or replicated to serve huge > amount of queries. Having a separate index should be avoided un

Re: Sharding vs single index vs separate collection

2017-06-08 Thread Susheel Kumar
with one index/collection which can be shared if needed or replicated to serve huge amount of queries. Having a separate index should be avoided unless you have very good reasons. Thnx On Thu, Jun 8, 2017 at 5:45 PM, Johannes Knaus <kn...@mpdl.mpg.de> wrote: > Hi, > I have a solr

Sharding vs single index vs separate collection

2017-06-08 Thread Johannes Knaus
Hi, I have a solr cloud setup, with document routing (implicit routing with router field). As the index is about documents with a publication date, I routed according the publication year, as in my case, most of the search queries will have a year specified. Now, what would be the best

Re: Re-Index is not working

2017-06-08 Thread Erick Erickson
OK - Contractor > Sent: Thursday, June 08, 2017 10:12 AM > To: 'solr-user@lucene.apache.org' > Subject: RE: Re-Index is not working > > Sorry I did not give enough information. > > "doesn't work" does mean that the documents are not getting indexed. I am > using

RE: Re-Index is not working

2017-06-08 Thread Miller, William K - Norman, OK - Contractor
-Original Message- From: Miller, William K - Norman, OK - Contractor Sent: Thursday, June 08, 2017 10:12 AM To: 'solr-user@lucene.apache.org' Subject: RE: Re-Index is not working Sorry I did not give enough information. "doesn't work" does mean that the documents are not getting inde

RE: Re-Index is not working

2017-06-08 Thread Miller, William K - Norman, OK - Contractor
Sorry I did not give enough information. "doesn't work" does mean that the documents are not getting indexed. I am using a full import. I did discover that if I used the Linux touch command that the document would re-index. I don't have any of the logs as I have been a

Re: Re-Index is not working

2017-06-07 Thread Erick Erickson
K - Norman, OK - Contractor <william.k.mil...@usps.gov.invalid> wrote: > Hello, I am new to this mailing list and I am having a problem with > re-indexing. I will run an index on an xml file using the > DataImportHandler and it will index the file. Then I delete the index > usin

Re-Index is not working

2017-06-07 Thread Miller, William K - Norman, OK - Contractor
Hello, I am new to this mailing list and I am having a problem with re-indexing. I will run an index on an xml file using the DataImportHandler and it will index the file. Then I delete the index using the *:*, , and commands. Then I attempt to re-index the same file with the same

Re: Different DateTime format in dataimport and index

2017-06-06 Thread SOLR4189
I don't use DB. I do dataimport from one collection of SOLR to another collection with the same configuration. -- View this message in context: http://lucene.472066.n3.nabble.com/Different-DateTime-format-in-dataimport-and-index-tp4339230p4339244.html Sent from the Solr - User mailing list

Re: Different DateTime format in dataimport and index

2017-06-06 Thread Erick Erickson
st1, price: 100, name: pizza, pickupTime: 2017-06-06T19:00:00}* > and in reindex or dataimport I see in log: > *{id: test1, price: 100.0, name: pizza, pickupTime: Tue Jun 6 19:00:00 IDT > 2017}* > > Why do float and date have different format in index and dataimport? Is it > SOLR

Different DateTime format in dataimport and index

2017-06-06 Thread SOLR4189
question is why in indexing of item I see in log: *{id: test1, price: 100, name: pizza, pickupTime: 2017-06-06T19:00:00}* and in reindex or dataimport I see in log: *{id: test1, price: 100.0, name: pizza, pickupTime: Tue Jun 6 19:00:00 IDT 2017}* Why do float and date have different format in index
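The snippet does not say why the two log lines differ, so the following is only a minimal sketch for checking that they carry the same values. It assumes both timestamps are the same wall-clock time (the zone abbreviation in the second line is simply dropped), and the dictionary names and helper functions are hypothetical, introduced for illustration.

```python
from datetime import datetime

# Values copied from the two log lines quoted above.
INDEX_LOG = {"price": "100", "pickupTime": "2017-06-06T19:00:00"}
IMPORT_LOG = {"price": "100.0", "pickupTime": "Tue Jun 6 19:00:00 IDT 2017"}


def parse_iso_style(text):
    """Parse the ISO-like rendering seen in the indexing log."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")


def parse_log_style(text):
    """Parse the 'Tue Jun 6 19:00:00 IDT 2017' rendering.

    The zone abbreviation (5th token) is dropped, because strptime's %Z
    does not portably recognise abbreviations such as IDT.
    Assumes English day/month names (the default C locale).
    """
    parts = text.split()
    without_zone = " ".join(parts[:4] + parts[5:])
    return datetime.strptime(without_zone, "%a %b %d %H:%M:%S %Y")


def same_values(a, b):
    """True if price and pickupTime agree once formatting is ignored."""
    same_price = float(a["price"]) == float(b["price"])
    same_time = parse_iso_style(a["pickupTime"]) == parse_log_style(b["pickupTime"])
    return same_price and same_time


print(same_values(INDEX_LOG, IMPORT_LOG))
```

Under those assumptions the difference is only in how the float and the date are printed, not in the values themselves.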

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-06-05 Thread Allison, Timothy B.
AM To: solr-user@lucene.apache.org Subject: RE: Solr 6.4. Can't index MS Visio vsdx files Great Tim. What do I need to do to integrate it on my current installation? On May 31, 2017 16:24, "Allison, Timothy B." <talli...@mitre.org> wrote: Apache Tika 1.15 is now available. -

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-06-03 Thread Gytis Mikuciunas
y, May 9, 2017 7:45 AM To: solr-user@lucene.apache.org Subject: RE: Solr 6.4. Can't index MS Visio vsdx files Probably better to ask on the Tika list. We'll push the release asap after PDFBox 2.0.6 is out. Andreas plans to cut the release candidate for PDFBox this Friday. Tika will probably have

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-05-31 Thread Allison, Timothy B.
Apache Tika 1.15 is now available. -Original Message- From: Allison, Timothy B. [mailto:talli...@mitre.org] Sent: Tuesday, May 9, 2017 7:45 AM To: solr-user@lucene.apache.org Subject: RE: Solr 6.4. Can't index MS Visio vsdx files Probably better to ask on the Tika list. We'll push

Re: Solr error: org.apache.solr.common.SolrException: Exception writing document id files_21122 to the index; possible analysis error.

2017-05-31 Thread Lars Müller
com>: Lars, More info is needed! Were you able to index _any_ documents before this happened? Are you POSTing via curl or something else? What is your config? Did you change your config just before this? Is the error repeatable? Any idea why the IndexWriter would be closed?

Re: Solr error: org.apache.solr.common.SolrException: Exception writing document id files_21122 to the index; possible analysis error.

2017-05-31 Thread Rick Leir
Lars, More info is needed! Were you able to index _any_ documents before this happened? Are you POSTing via curl or something else? What is your config? Did you change your config just before this? Is the error repeatable? Any idea why the IndexWriter would be closed? "C

Solr error: org.apache.solr.common.SolrException: Exception writing document id files_21122 to the index; possible analysis error.

2017-05-31 Thread Lars Müller
Hello, I installed Solr 6.5.1 on Ubuntu. Using it with Nextcloud 12. I get this error Message: ERROR true RequestHandlerBase org.apache.solr.common.SolrException: Exception writing document id files_21122 to the index; possible analysis error

Re: (Tiny Index) Solr dies but not OOM

2017-05-26 Thread Shawn Heisey
his suggests that you are using some swap for > java pages. But I am not at the bash prompt, you are! The virtual size in "top" includes all the index data that the Solr install is serving, as well as all of the actual RAM that's used. I have one server where VIRT for Solr is over 700 g

Re: (Tiny Index) Solr dies but not OOM

2017-05-26 Thread Rick Leir
e inquisitiveness now more than anything. >>> >>> http://web.lavoco.com/top.png >>> >>> (forgot to mention mariadb on there too :) >>> >>> >>> >>> On 26/05/17 16:20, Shawn Heisey wrote: >>>> On 5/26/2017 11:01 AM, Rober

Re: (Tiny Index) Solr dies but not OOM

2017-05-26 Thread Robert Brown
<r...@lavoco.com> wrote: Thanks Shawn, It's more inquisitiveness now more than anything. http://web.lavoco.com/top.png (forgot to mention mariadb on there too :) On 26/05/17 16:20, Shawn Heisey wrote: On 5/26/2017 11:01 AM, Robert Brown wrote: Let's assume I can't get more RAM - why would an

Re: (Tiny Index) Solr dies but not OOM

2017-05-26 Thread Rick Leir
> >(forgot to mention mariadb on there too :) > > > >On 26/05/17 16:20, Shawn Heisey wrote: >> On 5/26/2017 11:01 AM, Robert Brown wrote: >>> Let's assume I can't get more RAM - why would an index of no more >than >>> 1MB (on disk) need so much? >

Re: (Tiny Index) Solr dies but not OOM

2017-05-26 Thread Robert Brown
Thanks Shawn, It's more inquisitiveness now more than anything. http://web.lavoco.com/top.png (forgot to mention mariadb on there too :) On 26/05/17 16:20, Shawn Heisey wrote: On 5/26/2017 11:01 AM, Robert Brown wrote: Let's assume I can't get more RAM - why would an index of no more

Re: (Tiny Index) Solr dies but not OOM

2017-05-26 Thread Shawn Heisey
On 5/26/2017 11:01 AM, Robert Brown wrote: > Let's assume I can't get more RAM - why would an index of no more than > 1MB (on disk) need so much? > > (without getting into why I'm using Solr on such a small index in the > first place :) > > My docs consist of 3 text fiel

Re: (Tiny Index) Solr dies but not OOM

2017-05-26 Thread Robert Brown
Let's assume I can't get more RAM - why would an index of no more than 1MB (on disk) need so much? (without getting into why I'm using Solr on such a small index in the first place :) My docs consist of 3 text fields for searching, all others are strings/ints for facets and filtering

Re: (Tiny Index) Solr dies but not OOM

2017-05-26 Thread Erick Erickson
ose 2 are both restarted upon a deploy, which is what knocks Solr down. > > I'll experiment with different heap values, with a 1MB index (on disk) I > should be able to get it fairly low. > > I have the same occasional problem on my dev box, which only has 1GB RAM - > quite surpri

Re: (Tiny Index) Solr dies but not OOM

2017-05-26 Thread Robert Brown
things on the box are nginx and my Perl web-app. Those 2 are both restarted upon a deploy, which is what knocks Solr down. I'll experiment with different heap values, with a 1MB index (on disk) I should be able to get it fairly low. I have the same occasional problem on my dev box, which only has

Re: (Tiny Index) Solr dies but not OOM

2017-05-26 Thread Rick Leir
to reduce the voluminous output. You could upgrade your hardware cheaply at a surplus store (almost every machine in my office is surplus .. think .. actually, every one). cheers -- Rick On 2017-05-25 06:55 PM, Robert Brown wrote: Hi, I'm currently running 6.5.1 with a tiny index, less than

(Tiny Index) Solr dies but not OOM

2017-05-25 Thread Robert Brown
Hi, I'm currently running 6.5.1 with a tiny index, less than 1MB. When I restart another app on the same server as Solr, Solr occasionally dies, but no solr_oom_killer.log file. Heap size is 256MB (~30MB used), Physical RAM 2GB, typically using 1.5GB. How else can I debug what's causing

Re: SOLR Index and Schema.xml file corruption

2017-05-23 Thread Erick Erickson
If you have classic schema factory configured, then Solr will not write the schema.xml file out. So either something's strange with SiteCore or someone inadvertently hand-edited the schema. I suggest contacting the SiteCore people to see how it would get that way. You should be able to shut

SOLR Index and Schema.xml file corruption

2017-05-23 Thread LAD, SAGAR
Hi SOLR team, We are using SOLR 4.6.0 with Sitecore CMS 7.2. We observe that the search indexes and sometimes the schema.xml file get corrupted. A schema.xml field tag gets an extra forward slash, which results in SOLR stopping. We have " " therefore only manual update is allowed. Please guide us

Re: LukeRequestHandler not returning all fields in the index

2017-05-22 Thread Yago Riveiro
in a simple programmatic way :/ Thanks for the answer Erick. - Best regards /Yago -- View this message in context: http://lucene.472066.n3.nabble.com/LukeRequestHandler-not-returning-all-fields-in-the-index-tp4336287p4336332.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: LukeRequestHandler not returning all fields in the index

2017-05-22 Thread Erick Erickson
Luke really doesn't operate at a level that knows about collections and the like, see: https://issues.apache.org/jira/browse/SOLR-8127. So far there hasn't been much interest in extending it to the collection level particularly because it's intended to get you low-level index characteristics

LukeRequestHandler not returning all fields in the index

2017-05-22 Thread Yago Riveiro
72066.n3.nabble.com/LukeRequestHandler-not-returning-all-fields-in-the-index-tp4336287.html Sent from the Solr - User mailing list archive at Nabble.com.

RE: Solr Index issue on string type while querying

2017-05-16 Thread Matt Kuiper
bhan V [mailto:padmanabhan.venkitachalapa...@gmail.com] Sent: Tuesday, May 16, 2017 9:33 AM To: solr-user@lucene.apache.org Subject: Solr Index issue on string type while querying Hello Solr Geeks, Am looking for some helping hands to proceed on an issue am facing now. Here given below one recor

Solr Index issue on string type while querying

2017-05-16 Thread Padmanabhan V
Hello Solr Geeks, I am looking for some help with an issue I am facing now. Given below is one record from the prepared index. I could query the fields without the greater-than symbol, but when I queried widthSquareTube_string_mv & heightSquareTube_string_mv, it is not retur

Re: SolrSpellChecker returning suggestions for words present in index

2017-05-11 Thread aruninfo100
Hi Alessandro, I tried the parameters you suggested and it is working fine now. Thanks. Thanks and Regards, Arun -- View this message in context: http://lucene.472066.n3.nabble.com/SolrSpellChecker-returning-suggestions-for-words-present-in-index-tp4334554p4334756.html

Re: Recommended index-size per core

2017-05-11 Thread Erick Erickson
docs. So it's really impossible to answer "for an index with on-disk size X, how much memory do I need?" I've seen the stored data be a very significant portion of the on-disk size. Best, Erick On Thu, May 11, 2017 at 5:24 PM, Shawn Heisey <apa...@elyograg.org> wrote: > On 5/1

Re: Recommended index-size per core

2017-05-11 Thread Shawn Heisey
On 5/11/2017 4:59 PM, S G wrote: > How can 50GB index be handled by a 10GB heap? > I am a developer myself and would love to know as many details as possible. > So a long answer would be much appreciated. Lucene (which is what provides large pieces of Solr's functionality) does

Re: Recommended index-size per core

2017-05-11 Thread S G
Thanks Toke. Your answer did help me a lot. But one part of your answer is something that has always been confusing to me. > The JVM heap is not used for caching the index data directly (although it holds derived data). What you need is free memory on your machine for OS disk-cach

Re: Recommended index-size per core

2017-05-11 Thread Shawn Heisey
On 5/10/2017 11:52 AM, S G wrote: > Is there a recommendation on the size of index that one should host > per core? No, there really isn't. I can list off a bunch of recommendations, but a whole bunch of things that I don't know about your install could make those recommendations comp

Re: Recommended index-size per core

2017-05-11 Thread David Hastings
l.com> wrote: > > *Rough estimates for an initial size:* > > > > 50gb index is best served if all of it is in memory. > > Assuming you need low latency and/or high throughput, yes. I mention this > because in many cases the requirements for number of simultaneous user

Re: SolrSpellChecker returning suggestions for words present in index

2017-05-11 Thread alessandro.benedetti
suggestions even when the given query term is present in the index and considered "correct".* 2 *Specify the number of suggestions to return for each query term existing in the index and/or dictionary. Presumably, users will want fewer suggestions for words with docFrequency>0. Also setting thi

Re: Recommended index-size per core

2017-05-10 Thread Toke Eskildsen
S G <sg.online.em...@gmail.com> wrote: > *Rough estimates for an initial size:* > > 50gb index is best served if all of it is in memory. Assuming you need low latency and/or high throughput, yes. I mention this because in many cases the requirements for number of sim

Recommended index-size per core

2017-05-10 Thread S G
Hi, Is there a recommendation on the size of index that one should host per core? Idea is to come up with an *initial* shard/replica setting for a load test. And then arrive at a good cluster size based on that testing. *Example: * Num documents: 100 million Average document size: 1kb So total
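The total in the message above is cut off by the archive. As a rough sketch, the arithmetic on the two figures that are given looks like this. It treats 1 kb as 1000 bytes (an assumption), the function name is hypothetical, and the result is the raw document volume only, not the on-disk index size, which depends on the schema and on what is stored.

```python
def raw_size_gb(num_docs, avg_doc_bytes):
    """Raw input volume in decimal gigabytes (1 GB = 10**9 bytes)."""
    return num_docs * avg_doc_bytes / 10**9


# 100 million documents at roughly 1 kb each (figures from the message).
print(raw_size_gb(100_000_000, 1_000))  # prints 100.0
```

As the replies in this thread point out, a figure like this is only a starting point for a load test, since memory needs cannot be derived from on-disk size alone.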

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-05-09 Thread Allison, Timothy B.
.tar.gz -Original Message- From: Gytis Mikuciunas [mailto:gyt...@gmail.com] Sent: Tuesday, May 9, 2017 7:17 AM To: solr-user@lucene.apache.org Subject: Re: Solr 6.4. Can't index MS Visio vsdx files Are there any news regarding Tika 1.15? Maybe it's already ready for download somewhere G

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-05-09 Thread Gytis Mikuciunas
:gyt...@gmail.com] > Sent: Wednesday, April 12, 2017 1:00 AM > To: solr-user@lucene.apache.org > Subject: Re: Solr 6.4. Can't index MS Visio vsdx files > > when 1.15 will be released? maybe you have some beta version and I could > test it :) > > SAX sounds interesting, and f

Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Shalin Shekhar Mangar
I also opened https://issues.apache.org/jira/browse/SOLR-10532 to fix this annoying and confusing behavior of SuggestComponent. On Thu, Apr 20, 2017 at 8:40 PM, Andrea Gazzarini wrote: > Ah great, many thanks again! > > > > On 20/04/17 17:09, Shalin Shekhar Mangar wrote: >> >>

Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Andrea Gazzarini
Ah great, many thanks again! On 20/04/17 17:09, Shalin Shekhar Mangar wrote: Hi Andrea, Looks like I gave you some bad information. I looked at the code and ran a test locally. The suggest.build and suggest.reload params are in fact distributed to all shards but only to one replica of

Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Shalin Shekhar Mangar
Hi Andrea, Looks like I gave you some bad information. I looked at the code and ran a test locally. The suggest.build and suggest.reload params are in fact distributed to all shards but only to one replica of each shard. This is still bad enough and you should use buildOnOptimize as
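Going by the behaviour described in this reply (a build request reaches only one replica per shard), one possible workaround is to address every replica core directly. The sketch below only constructs the request URLs and sends nothing. The host and core names are hypothetical, the /suggest handler path is taken from the original question in this thread, and combining suggest.build=true with distrib=false is an assumption about how one would keep each request local to a core, not something confirmed in the thread.

```python
from urllib.parse import urlencode


def build_suggester_urls(replica_core_urls, handler="/suggest", dictionary=None):
    """Return one suggester-build URL per replica core.

    replica_core_urls: base URLs of individual cores (hypothetical names).
    dictionary: optional suggester name, passed as suggest.dictionary.
    """
    params = {
        "suggest": "true",
        "suggest.build": "true",
        "distrib": "false",  # assumption: keep the request on this core only
    }
    if dictionary is not None:
        params["suggest.dictionary"] = dictionary
    query = urlencode(params)
    return [base.rstrip("/") + handler + "?" + query for base in replica_core_urls]


# Hypothetical two-replica layout for a single shard.
for url in build_suggester_urls([
    "http://host1:8983/solr/coll_shard1_replica1",
    "http://host2:8983/solr/coll_shard1_replica2",
]):
    print(url)
```

Each printed URL could then be requested once per replica after the daily indexing run, which is the scenario discussed in this thread.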

Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Andrea Gazzarini
Perfect, I don't need NRT at this moment so that fits perfectly Thanks, Andrea On 20/04/17 14:37, Shalin Shekhar Mangar wrote: Yeah, if it is just once a day then you can afford to do an optimize. For a more NRT indexing approach, I wouldn't recommend optimize at all. On Thu, Apr 20, 2017 at

Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Shalin Shekhar Mangar
Yeah, if it is just once a day then you can afford to do an optimize. For a more NRT indexing approach, I wouldn't recommend optimize at all. On Thu, Apr 20, 2017 at 5:29 PM, Andrea Gazzarini wrote: > Ok, many thanks > > I see / read that it should be better to rely on the

Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Andrea Gazzarini
Ok, many thanks. I see / read that it is better to rely on background merging instead of issuing explicit optimizes, but I think in this case one optimize a day shouldn't be a problem. Did I get you correctly? Thanks again, Andrea On 20/04/17 13:17, Shalin Shekhar Mangar

Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Shalin Shekhar Mangar
On Thu, Apr 20, 2017 at 4:27 PM, Andrea Gazzarini <gxs...@gmail.com> wrote: > Hi Shalin, > many thanks for your response. This is my scenario: > > * I build my index once in a day, it could be a delta or a full >re-index.In any case, that takes some time; > * I h

Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Andrea Gazzarini
Hi Shalin, many thanks for your response. This is my scenario: * I build my index once a day, it could be a delta or a full re-index. In any case, that takes some time; * I have an auto-commit (hard, no soft-commits) set to a given period and during the indexing cycle, several hard

Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-20 Thread Shalin Shekhar Mangar
is working? Can I issue this >> command towards just one node, and have that node forward the >> request to the other nodes (so each of them can build its own >> suggester index portion)? The suggest.build only builds locally in the node to which you sent the requ

Re: Index and query time suggester behavior in a SolrCloud environment

2017-04-19 Thread Andrea Gazzarini
the *suggest.build* command is working? Can I issue this command towards just one node, and have that node forward the request to the other nodes (so each of them can build its own suggester index portion)? * how things are working at query time? Can I send a request with only

Re: Solr Index size keeps fluctuating, becomes ~4x normal size.

2017-04-18 Thread Shawn Heisey
en the previous commit and that commit will not be visible *on the master* until a later commit happens and IS able to open a new searcher. What happens on the slaves may be a little bit different, because commits normally only happen on the slave when a changed index is replicated from the ma

Index and query time suggester behavior in a SolrCloud environment

2017-04-18 Thread Andrea Gazzarini
to the other nodes (so each of them can build its own suggester index portion)? * how things are working at query time? Can I send a request with only suggest.q=... to my /suggest request handler and get back distributed suggestions? Thanks in advance Andrea

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-04-12 Thread Allison, Timothy B.
Message- From: Gytis Mikuciunas [mailto:gyt...@gmail.com] Sent: Wednesday, April 12, 2017 1:00 AM To: solr-user@lucene.apache.org Subject: Re: Solr 6.4. Can't index MS Visio vsdx files when 1.15 will be released? maybe you have some beta version and I could test it :) SAX sounds interesting

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-04-11 Thread Gytis Mikuciunas
t something needs to be fixed with > the parser. In short, the solution will come from POI. > > Best, > > Tim > > -Original Message- > From: Gytis Mikuciunas [mailto:gyt...@gmail.com] > Sent: Tuesday, April 11, 2017 1:56 PM > To: solr-user@lucene.a

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-04-11 Thread Allison, Timothy B.
: Gytis Mikuciunas [mailto:gyt...@gmail.com] Sent: Tuesday, April 11, 2017 1:56 PM To: solr-user@lucene.apache.org Subject: RE: Solr 6.4. Can't index MS Visio vsdx files Thanks for your responses. Is there any possibility to ignore parsing errors and continue indexing? Because now solr/tik

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-04-11 Thread Gytis Mikuciunas
Thanks for your responses. Is there any possibility to ignore parsing errors and continue indexing? Because now solr/tika stops parsing the whole document if it finds any exception. On Apr 11, 2017 19:51, "Allison, Timothy B." wrote: > You might want to drop a note to the dev

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-04-11 Thread Allison, Timothy B.
You might want to drop a note to the dev or user's list on Apache POI. I'm not extremely familiar with the vsd(x) portion of our code base. The first item ("PolylineTo") may be caused by a mismatch btwn your doc and the ooxml spec. The second item appears to be an unsupported feature. The

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-04-11 Thread Gytis Mikuciunas
common.SolrException", "root-error-class", "java.lang.ArrayIndexOutOfBoundsException" ] } } Regards, Gytis On Mon, Feb 6, 2017 at 6:54 PM, Allison, Timothy B. <talli...@mitre.org> wrote: > Shouldn't have taken you that much effort.

Re: Solr Index size keeps fluctuating, becomes ~4x normal size.

2017-04-11 Thread Toke Eskildsen
asionally, it is probably not a problem. If you see it often, that means that you are re-opening at a high rate, relative to the time it takes for a searcher to be ready. Since each searcher holds a lock on the files it searches, and you have multiple concurrent open searchers on a volatile index, that help

Unable to index UIMA field into Solr

2017-04-11 Thread aruninfo100
Hi All, I am trying to integrate UIMA with Solr. I was able to do the same. But some of the UIMA fields are not getting indexed into Solr whereas other *fields like pos,ChukType are getting indexed*. I am using openNLP-UIMA together for text analysis. When I tried to index the UIMA field

Re: Solr Index size keeps fluctuating, becomes ~4x normal size.

2017-04-10 Thread kshitij tyagi
in schema and index only those fields upon which you are querying and not index all the fields. 3. Check your segment count configuration in solrconfig.xml; it should not be too high or too low as it will affect indexing speed, a high number would give good indexing speed but a low search result

Re: Solr Index size keeps fluctuating, becomes ~4x normal size.

2017-04-10 Thread Himanshu Sachdeva
only the slaves? What purpose do the searchers serve exactly? Your time and guidance will be very much appreciated. Thank you. On Thu, Apr 6, 2017 at 6:12 PM, Toke Eskildsen <t...@kb.dk> wrote: > On Thu, 2017-04-06 at 16:30 +0530, Himanshu Sachdeva wrote: > > We monitored the index

Re: Problems creating index for suggestions

2017-04-07 Thread Alexis Aravena Silva
solr-user@lucene.apache.org Subject: Re: Problems creating index for suggestions Hi Alexis, this is not a reason for the 20Gb overhead, but for sure you are using the suggester component in the wrong way. You don't want the analysis chain to produce edge ngrams and then build the FST out of those to

Re: Problems creating index for suggestions

2017-04-07 Thread alessandro.benedetti
Ltd. - www.sease.io -- View this message in context: http://lucene.472066.n3.nabble.com/Problems-creating-index-for-suggestions-tp4328392p4328914.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr Index size keeps fluctuating, becomes ~4x normal size.

2017-04-06 Thread Toke Eskildsen
On Thu, 2017-04-06 at 16:30 +0530, Himanshu Sachdeva wrote: > We monitored the index size for a few days and found that it varies > widely from 11GB to 43GB. Lucene/Solr indexes consist of segments, each holding a number of documents. When a document is deleted, its bytes are not r

Solr Index size keeps fluctuating, becomes ~4x normal size.

2017-04-06 Thread Himanshu Sachdeva
red 10 slaves for handling the reads from website. Slaves poll master at an interval of 20 minutes. We monitored the index size for a few days and found that it varies widely from 11GB to 43GB. Recently, we started getting a lot of out of memory errors on the master. Every time, solr beco

Re: Problems creating index for suggestions

2017-04-05 Thread Alexis Aravena Silva
Hi Erick, numDocs and MaxDocs = 8. This is the content of the field _sugerencia_: [cid:e03430ab-ff19-4955-a6da-d50b38e89b3d] I've noticed that the problem occurs when Solr builds the fuzzySuggester index; in this type of suggestion, the temp file grows greatly and when the process finishes

Problems creating index for suggestions

2017-04-04 Thread Alexis Aravena Silva
Hi, I'm creating an index for suggestions. When I rebuild the index with 8 documents, Solr creates a temp file that consumes over 20GB in the process and it takes more than 10 minutes to reindex. What is the problem? It's illogical that Solr takes so long and consumes so much of my disk

Re: Index upgrade time and disk space

2017-04-02 Thread sputul
Thanks, Shawn, for getting back with a detailed explanation. I will run tests upfront with a large index and space, and see if a fast disk is needed. - Putul -- View this message in context: http://lucene.472066.n3.nabble.com/Index-upgrade-time-and-disk-space-tp4328003p4328040.html Sent from the Solr

Re: Index upgrade time and disk space

2017-04-02 Thread Shawn Heisey
On 4/2/2017 8:16 AM, Putul S wrote: > I am migrating Solr 4 index to Solr 5. The upgrade tool/script works well. > But ran out disk space upgrading 4 GB index. The server had at least 8 GB > free then. On production, the index is about 200 GB. > > How much disk space is need

Index upgrade time and disk space

2017-04-02 Thread Putul S
Hi all, I am migrating a Solr 4 index to Solr 5. The upgrade tool/script works well, but it ran out of disk space upgrading a 4 GB index. The server had at least 8 GB free then. On production, the index is about 200 GB. How much disk space is needed for indexing? Also, how long does it take to upgrade

RE: Index scanned documents

2017-03-27 Thread Allison, Timothy B.
, March 27, 2017 11:48 AM To: solr-user@lucene.apache.org Subject: Re: Index scanned documents I tried this solution from Tim Allison, and it works. http://stackoverflow.com/questions/32354209/apache-tika-extract-scanned-pdf-files Regards, Edwin On 27 March 2017 at 20:07, Allison, Timothy

Re: Index scanned documents

2017-03-27 Thread Zheng Lin Edwin Yeo
> -Original Message- > From: Arian Pasquali [mailto:arianpasqu...@gmail.com] > Sent: Sunday, March 26, 2017 11:44 AM > To: solr-user@lucene.apache.org > Subject: Re: Index scanned documents > > Hi Walled, > > I've never done that with solr, but you would probably need to u

RE: Index scanned documents

2017-03-27 Thread Allison, Timothy B.
-Original Message- From: Arian Pasquali [mailto:arianpasqu...@gmail.com] Sent: Sunday, March 26, 2017 11:44 AM To: solr-user@lucene.apache.org Subject: Re: Index scanned documents Hi Walled, I've never done that with solr, but you would probably need to use some OCR preprocessing before indexing

RE: Index scanned documents

2017-03-26 Thread Phil Scadden
While building directly into Solr might be appealing, I would argue that it is best to use OCR software first, outside of SOLR, to convert the PDF into "searchable" PDF format. That way when the document is retrieved, it is a lot more useful to the searcher - making it easy to find the text

Re: Index scanned documents

2017-03-26 Thread Arian Pasquali
Hi Walled, I've never done that with Solr, but you would probably need to use some OCR preprocessing before indexing. The most popular library I know for the job is tesseract-ocr. If you want to do that inside Solr, I've found that Tika has some support for that

Re: Index scanned documents

2017-03-26 Thread Zheng Lin Edwin Yeo
I'm also working on this issue right now, to extract the text in the scanned image in PDF files. From what I know, we can use Tesseract OCR to extract the text in the image through Apache Tika, and it comes bundled with Solr. By the way, which Solr version are you using? Regards,

Index scanned documents

2017-03-26 Thread Waleed Raza
Hello, I want to ask how we can extract text in Solr from images that are inside PDF and MS Office documents. I searched many websites but did not find an answer. Please guide me.

Re: Index scanned documents

2017-03-26 Thread Waleed Raza
Hello, I want to ask how we can extract text in Solr from images that are inside PDF and MS Office documents. I searched many websites but did not find an answer. Please guide me. On Sun, Mar 26, 2017 at 2:57 PM, Waleed Raza wrote: > Hello > I want to

Re: Storing index of different collections in different location

2017-03-20 Thread Zheng Lin Edwin Yeo
i.apache.org/confluence/display/solr/Defining+core.properties > mentions > > dataDir > > The core's data directory (where indexes are stored) as either an absolute > pathname, or a path relative to the value of instanceDir. This is data by > default. > Probably you can adjust index

Re: Storing index of different collections in different location

2017-03-19 Thread Mikhail Khludnev
nceDir. This is data by default. Probably you can adjust index location if you create shards manually. On Sun, Mar 19, 2017 at 5:46 PM, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote: > Hi, > > Is it possible to store the index of different collections of the same > shard under dif

Storing index of different collections in different location

2017-03-19 Thread Zheng Lin Edwin Yeo
Hi, Is it possible to store the index of different collections of the same shard under different directories, or even on different hard disks? For example, I want to store the index of collection1 on the C drive, and the index of collection2 on the D drive. I'm using SolrCloud in Solr 6.4.2. Regards, Edwin

Re: Finding time of last commit to index from SolrJ?

2017-03-16 Thread Damien Kamerman
I ended up doing something like this: String core = "collection1_shard1_core1"; ModifiableSolrParams p = new ModifiableSolrParams(); p.set("show", "index"); GenericSolrRequest checkRequest = new GenericSolrRequest(POST, "/../" + core + "/admin/luke

sum multivalued field index with banana

2017-03-16 Thread tkg_cangkul
Hi, sorry if this is a little bit off topic. I've just started using the Banana dashboard, and I want to summarize data that is indexed in Solr. Can I compute a sum with the Banana dashboard when I have multivalued data indexed in my field? This is my sample data in Solr

Re: Index corruption with replication

2017-03-16 Thread santosh sidnal
;/app/IBM/WebSphere/CommerceServer70/instances/RBUATLV/search/solr/home/MC_10001/fr_FR/CatalogEntry/data/index/_5a.fdt")) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:236) at org.apache.lucene.index.SegmentReade
