Have you done what the message says and looked at your Solr log? If so,
what information is there?
> On Dec 23, 2020, at 5:13 AM, DINSD | SPAutores
> wrote:
>
> Hi,
>
I'm trying to install the package "data-import-handler", since it was
discontinued from the core Solr distro.
>
> https://git
On 12/18/2020 12:03 AM, basel altameme wrote:
While trying to import & index data from a MySQL DB custom view I am facing
the error below:
Data Config problem: The value of attribute "query" associated with an element type
"entity" must not contain the '<' character.
Please note that in my SQL st
Have you tried escaping that character?
> On Dec 18, 2020, at 2:03 AM, basel altameme
> wrote:
>
> Dear,
> While trying to import & index data from a MySQL DB custom view I am facing
> the error below:
> Data Config problem: The value of attribute "query" associated with an
> element type "enti
On 11/30/2020 7:50 AM, David Smiley wrote:
Yes, absolutely to what Eric said. We goofed on news / release highlights
on how to communicate what's happening in Solr. From a Solr insider point
of view, we are "deprecating" because strictly speaking, the code isn't in
our codebase any longer. From a user point of view (the audience of news
You don’t need to abandon DIH right now…. You can just use the Github hosted
version…. The more people who use it, the better a community it will form
around it! It’s a bit chicken-and-egg: since no one is actively discussing
it, submitting PRs, etc., it may languish. If you use it, and
On 11/29/2020 10:32 AM, Erick Erickson wrote:
And I absolutely agree with Walter that the DB is often where
the bottleneck lies. You might be able to
use multiple threads and/or processes to query the
DB if that’s the case and you can find some kind of partition
key.
IME the difficult part has
If you like Java instead of Python, here’s a skeletal program:
https://lucidworks.com/post/indexing-with-solrj/
It’s simple and single-threaded, but could serve as a basis for
something along the lines that Walter suggests.
And I absolutely agree with Walter that the DB is often where
the bottle
I recommend building an outboard loader, like I did a dozen years ago for
Solr 1.3 (before DIH) and did again recently. I’m glad to send you my Python
program, though it reads from a JSONL file, not a database.
Run a loop fetching records from a database. Put each record into a synchronized
(threa
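Walter's loop can be sketched roughly like this (a minimal Python sketch; `post_batch` is a hypothetical stand-in for the HTTP POST to Solr's /update handler, and a `range` stands in for rows fetched from the database):

```python
import queue
import threading

BATCH_SIZE = 100
NUM_WORKERS = 4

posted = []                      # stands in for documents that reached Solr
posted_lock = threading.Lock()

def post_batch(batch):
    # Hypothetical stand-in for an HTTP POST of `batch` to Solr's /update.
    with posted_lock:
        posted.extend(batch)

def worker(q):
    batch = []
    while True:
        rec = q.get()
        if rec is None:          # sentinel tells this worker to flush and stop
            break
        batch.append(rec)
        if len(batch) >= BATCH_SIZE:
            post_batch(batch)
            batch = []
    if batch:
        post_batch(batch)

q = queue.Queue(maxsize=1000)    # bounded, so the DB reader can't outrun Solr
threads = [threading.Thread(target=worker, args=(q,)) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()

# The fetch loop: here a range stands in for rows read from the database.
for record in range(1050):
    q.put({"id": record})
for _ in threads:
    q.put(None)                  # one sentinel per worker
for t in threads:
    t.join()

print(len(posted))
```

The bounded queue is the synchronization point: the single reader thread never gets far ahead of the indexing workers, which is usually what keeps the DB, not Solr, as the bottleneck.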
I went through the same stages of grief that you are about to start
but (luckily?) my core dataset grew some weird cousins and we ended up
writing our own indexer to join them all together/do partial
updates/other stuff beyond DIH. It's not difficult to upload docs but
is definitely slower so far.
On 11/28/2020 5:48 PM, matthew sporleder wrote:
... The bottom of
that github page isn't hopeful however :)
Yeah, "works with MariaDB" is a particularly bad way of saying "BYO JDBC
JAR" :)
It's a more general question though: what is the path forward for users
who have data in two places?
https://solr.cool/#utilities -> https://github.com/rohitbemax/dataimporthandler
You can import it in the many new/novel ways to add things to a solr
install and it should work like always (apparently). The bottom of
that github page isn't hopeful however :)
On Sat, Nov 28, 2020 at 5:21 PM Dmitri
Hello, James.
DataImportHandler has a lock preventing concurrent execution. If you need
to run several imports in parallel at the same core, you need to duplicate
"/dataimport" handlers definition in solrconfig.xml. Thus, you can run them
in parallel. Regarding schema, I prefer the latter but mile
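A sketch of what the duplicated handler definitions in solrconfig.xml might look like (the second handler name and both config-file names are illustrative):

```xml
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config-a.xml</str>
  </lst>
</requestHandler>

<requestHandler name="/dataimport2"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config-b.xml</str>
  </lst>
</requestHandler>
```

Each handler holds its own lock, so full-imports against /dataimport and /dataimport2 can run concurrently on the same core.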
Data, IM & Analytics
>
>
>
> Lautrupparken 40-42, DK-2750 Ballerup
> E-mail m...@kmd.dk Web www.kmd.dk
> Mobil +4525571418
>
> -Original Message-
> From: Alexandre Rafalovitch
> Sent: 2 October 2018 18:18
> To: solr-user
> Emne: Re: data-imp
Admin UI for DIH will show you the config file read. So, if nothing is
there, the path is most likely the issue
You can also provide or update the configuration right in UI if you
enable debug.
Finally, the config file is reread on every invocation, so you don't
need to restart the core after cha
> url="C:/Users/z6mhq/Desktop/data_import/nh_test.xml"
Have you tried url="C:\\Users\\z6mhq\\Desktop\\data_import\\nh_test.xml" ?
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
> On 2 Oct 2018, at 17:15, Martin Frank Hansen (MHQ) wrote:
>
> Hi,
>
> I am having some pr
Hi Thomas,
Is this SolrCloud or Solr master-slave? Do you update the index while indexing? Did
you check if all your instances behind LB are in sync if you are using
master-slave?
My guess would be that DIH is using cursors to read data from another Solr. If
you are using multiple Solr instances beh
Thank you both for the responses. I was able to get the import working
through telnet, and I'll see if I can get the post utility working as that
seems like a better option.
Thanks,
Adam
On Mon, Aug 20, 2018, 2:04 PM Alexandre Rafalovitch
wrote:
> Admin UI just hits Solr for a particular URL wi
Admin UI just hits Solr for a particular URL with specific parameters.
You could totally call it from the command line, but it _would_ need
to be an HTTP client of some sort. You could encode all of the
parameters into the DIH (or a new) handler, it is all defined in
solrconfig.xml (/dataimport is
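For example, the request behind the Admin UI's import button boils down to a URL like the one built below (a sketch; the host and the core name `mycore` are illustrative):

```python
from urllib.parse import urlencode

# Hypothetical host and core; the handler path (/dataimport) comes from
# whatever is defined in solrconfig.xml.
base = "http://localhost:8983/solr/mycore/dataimport"
params = {"command": "full-import", "clean": "true", "commit": "true"}
url = base + "?" + urlencode(params)
print(url)
```

A plain HTTP GET on that URL (via urllib, curl, or even telnet) does exactly what the Admin UI button does.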
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256
Adam,
On 8/20/18 1:45 PM, Adam Blank wrote:
> I'm running Solr 5.5.0 on AIX, and I'm wondering if there's a way
> to import the index from the command line instead of using the
> admin console? I don't have the ability to use a HTTP client such
> a
On 4/16/2018 7:32 PM, gadelkareem wrote:
I can't complain because it has actually worked well for me so far, but
I still do not understand: if Solr already paginates the results from the
full import, why not do the same for the delta? It is almost the same query:
`select id from t where t.lastmod > ${s
Thanks Shawn.
I can't complain because it has actually worked well for me so far, but
I still do not understand: if Solr already paginates the results from the
full import, why not do the same for the delta? It is almost the same query:
`select id from t where t.lastmod > ${solrTime}`
`select * from t w
On 4/5/2018 7:31 PM, gadelkareem wrote:
Why the deltaImportQuery uses "where id='${dataimporter.id}'" instead of
> something like where id IN ('${dataimporter.id}')
Because there's only one value for that property.
If the deltaQuery returns a million rows, then deltaImportQuery is going
to be e
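A minimal sketch of how the two queries pair up in a DIH entity (table and column names are illustrative; the `${dataimporter.*}` variables follow the DIH convention):

```xml
<entity name="item"
        query="SELECT * FROM t"
        deltaQuery="SELECT id FROM t WHERE lastmod &gt; '${dataimporter.last_index_time}'"
        deltaImportQuery="SELECT * FROM t WHERE id = '${dataimporter.id}'"/>
```

deltaQuery returns only the changed ids; deltaImportQuery is then run once per returned id to fetch that row.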
I just tried putting the solr-dataimporthandler-6.6.0.jar in server/solr/lib
and I got past the problem. I still don't understand why not found in /dist
-Original Message-
From: Steve Pruitt [mailto:bpru...@opentext.com]
Sent: Thursday, August 31, 2017 3:05 PM
To: solr-user@lucene.apach
If Solr is down, then adding through SolrJ would fail as well. Kafka's new
API has some great features for this sort of thing. The new client API is
designed to be run in a long-running loop where you poll for new messages
with a certain amount of defined timeout (ex: consumer.poll(1000) for 1s)
So
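The shape of that long-running poll loop can be sketched in Python (`FakeConsumer` is a stand-in for a real Kafka consumer; only the loop structure is the point, and a real loop would run until shutdown rather than exit after empty polls):

```python
import queue

class FakeConsumer:
    """Stand-in for a Kafka consumer; poll(timeout) returns a list of messages."""
    def __init__(self, messages):
        self._q = queue.Queue()
        for m in messages:
            self._q.put(m)

    def poll(self, timeout_s):
        try:
            return [self._q.get(timeout=timeout_s)]
        except queue.Empty:
            return []

consumer = FakeConsumer(["doc1", "doc2", "doc3"])
indexed = []
empty_polls = 0
while empty_polls < 2:           # a real loop would run forever / until shutdown
    msgs = consumer.poll(0.01)   # ~consumer.poll(1000) in the Kafka Java API
    if not msgs:
        empty_polls += 1
        continue
    empty_polls = 0
    indexed.extend(msgs)         # here: send the batch to Solr, retrying if down
print(indexed)
```

Because the messages stay in the queue until consumed, a Solr outage only delays indexing instead of dropping updates.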
Are Kafka and SQS interchangeable? (The latter does not seem to be free.)
@Wunder:
I'm assuming that updating Solr would fail if Solr is unavailable, not
just when posting via, say, a DB trigger, but probably also when posting
through SolrJ? (Which is what I'm using for now.) So, even if us
@lucene.apache.org
Subject: Re: Data Import
Hi Daphne,
Are you using DSE?
Thanks & Regards,
Vishal
On Fri, Mar 17, 2017 at 7:40 PM, Liu, Daphne
wrote:
> I just want to share my recent project. I have successfully sent all
> our EDI documents to Cassandra 3.7 clusters using Solr 6.3 Data Imp
Streaming the data through kafka would be a good option if near real time
data indexing is the key requirement.
In our application the RDBMS data is populated by an ETL job periodically
so we don't need real time data indexing for now.
Cheers,
Vishal
On Fri, Mar 17, 2017 at 10:30 PM, Erick Ericks
That fails if Solr is not available.
To avoid dropping updates, you need some kind of persistent queue. We use
Amazon SQS for our incremental updates.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 17, 2017, at 10:09 AM, OTH wrote:
>
> Could
>
> CEVA Logistics / 10751 Deerwood Park Blvd, Suite 200, Jacksonville, FL
> 32256 USA / www.cevalogistics.com T 904.564.1192 / F 904.928.1448 /
> daphne@cevalogistics.com
>
>
>
> -Original Message-
> From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
>
Could the database trigger not just post the change to solr?
On Fri, Mar 17, 2017 at 10:00 PM, Erick Erickson
wrote:
> Or set a trigger on your RDBMS's main table to put the relevant
> information in a different table (call it EVENTS) and have your SolrJ
> consult the EVENTS table periodically.
Or set a trigger on your RDBMS's main table to put the relevant
information in a different table (call it EVENTS) and have your SolrJ
consult the EVENTS table periodically. Essentially you're using the
EVENTS table as a queue where the trigger is the producer and the
SolrJ program is the consumer.
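A minimal sketch of that EVENTS-table queue, using Python and an in-memory sqlite3 database in place of the real RDBMS and the SolrJ consumer (table layout and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, doc_id TEXT,"
    " processed INTEGER DEFAULT 0)")
# In production a trigger on the main table would be the producer;
# here we insert a few events by hand.
conn.executemany("INSERT INTO events (doc_id) VALUES (?)",
                 [("doc-1",), ("doc-2",), ("doc-3",)])
conn.commit()

def drain_events(conn):
    """One polling pass: fetch unprocessed events, 'index' them, mark them done."""
    rows = conn.execute(
        "SELECT id, doc_id FROM events WHERE processed = 0 ORDER BY id").fetchall()
    for event_id, doc_id in rows:
        # Here the real consumer would fetch the full row and send it to Solr.
        conn.execute("UPDATE events SET processed = 1 WHERE id = ?", (event_id,))
    conn.commit()
    return [doc_id for _, doc_id in rows]

indexed = drain_events(conn)
remaining = conn.execute(
    "SELECT COUNT(*) FROM events WHERE processed = 0").fetchone()[0]
print(indexed, remaining)
```

Marking rows processed only after the send succeeds is what makes the table behave like a durable queue: if Solr is down, the events simply wait for the next polling pass.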
Thanks to all of you for the valuable inputs.
Being on J2ee platform I also felt using solrJ in a multi threaded
environment would be a better choice to index RDBMS data into SolrCloud.
I will try with a scheduler triggered micro service to do the job using
SolrJ.
Regards,
Vishal
On Fri, Mar 17,
One assumes by hooking into the same code that updates the RDBMS, as
opposed to reverse engineering the changes by looking at the DB
content. This would especially be the case for Delete changes.
Regards,
Alex.
http://www.solr-start.com/ - Resources for Solr users, new and experienced
O
>
> Also, SolrJ is good when you want your RDBMS updates made immediately
> available in Solr.
How can SolrJ be used to make RDBMS updates immediately available?
Thanks
On Fri, Mar 17, 2017 at 2:28 PM, Sujay Bawaskar
wrote:
> Hi Vishal,
>
> As per my experience DIH is the best for RDBMS to solr
s.com
-Original Message-
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
Sent: Friday, March 17, 2017 9:54 AM
To: solr-user
Subject: Re: Data Import
I feel DIH is much better for prototyping, even though people do use it in
production. If you do want to use DIH, you may benefit
I feel DIH is much better for prototyping, even though people do use
it in production. If you do want to use DIH, you may benefit from
reviewing the DIH-DB example I am currently rewriting in
https://issues.apache.org/jira/browse/SOLR-10312 (may need to change
luceneMatchVersion in solrconfig.xml f
On 3/17/2017 3:04 AM, vishal jain wrote:
> I am new to Solr and am trying to move data from my RDBMS to Solr. I know the
> available options are:
> 1) Post Tool
> 2) DIH
> 3) SolrJ (as ours is a J2EE application).
>
> I want to know what is the recommended way for Data import in production
> envir
Hi Vishal,
In my experience, DIH is the best option for getting RDBMS data into a Solr
index. DIH with caching has the best performance, and DIH nested entities
let you define simple queries.
Also, SolrJ is good when you want your RDBMS updates made immediately
available in Solr. DIH full import can be used to index all
Also, upgrade to 6.4.2. There are serious performance problems in 6.4.0 and
6.4.1.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 15, 2017, at 12:05 PM, Liu, Daphne
> wrote:
>
> For Solr 6.3, I have to move mine to
> ../solr-6.3.0/server/s
For Solr 6.3, I had to move mine to
../solr-6.3.0/server/solr-webapp/webapp/WEB-INF/lib, if you are using Jetty.
Kind regards,
Daphne Liu
BI Architect - Matrix SCM
CEVA Logistics / 10751 Deerwood Park Blvd, Suite 200, Jacksonville, FL 32256
USA / www.cevalogistics.com T 904.564.1192 / F 904.
You could configure the DataImportHandler not to delete at the start
(either do a delta import or set preImportDeleteQuery), and set a
postImportDeleteQuery if required.
On Saturday, 4 March 2017, Alexandre Rafalovitch wrote:
> Commit is index global. So if you have overlapping timelines and commit
Commit is index-global. So if you have overlapping timelines and a commit is
issued, it will affect all changes done to that point.
So, the aliases may be better for you. You could potentially also reload a
core with changed solrconfig.xml settings, but that's heavy on caches.
Regards,
Alex
On
>
> You have indicated that you have a way to avoid doing updates during the
> full import. Because of this, you do have another option that is likely
> much easier for you to implement: Set the "commitWithin" parameter on
> each update request. This works almost identically to autoSoftCommit,
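For example, a commitWithin of 10 seconds attached to a single update request might look like this (a sketch; the host and core name are illustrative, and commitWithin is in milliseconds):

```python
from urllib.parse import urlencode

# Hypothetical core name. commitWithin=10000 means every document in this
# request becomes searchable within 10s, much like an autoSoftCommit would,
# but scoped to this request only.
update_url = ("http://localhost:8983/solr/mycore/update?"
              + urlencode({"commitWithin": 10000}))
docs = '[{"id": "1", "title_s": "hello"}]'   # JSON body to POST to update_url
print(update_url)
```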
On 3/3/2017 10:17 AM, Sales wrote:
> I am not sure how best to handle this. We use the data import handle to
> re-sync all our data on a daily basis, takes 1-2 hours depending on system
> load. It is set up to commit at the end, so, the old index remains until it’s
> done, and, we lose no access
> On Mar 3, 2017, at 11:30 AM, Erick Erickson wrote:
>
> One way to handle this (presuming SolrCloud) is collection aliasing.
> You create two collections, c1 and c2. You then have two aliases. when
> you start "index" is aliased to c1 and "search" is aliased to c2. Now
> do your full import to
One way to handle this (presuming SolrCloud) is collection aliasing.
You create two collections, c1 and c2. You then have two aliases. when
you start "index" is aliased to c1 and "search" is aliased to c2. Now
do your full import to "index" (and, BTW, you'd be well advised to do
at least a hard co
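The alias flip itself can be sketched as below. In a real cluster each flip is a CREATEALIAS call to the Collections API (re-pointing an existing alias); the dict here just illustrates the swap:

```python
# Two collections, two aliases: "index" receives the full import,
# "search" serves queries.
aliases = {"index": "c1", "search": "c2"}

def swap_after_full_import(aliases):
    """After a successful full import into aliases['index'], flip the two
    aliases so searches hit the freshly built collection. With SolrCloud
    this would be two CREATEALIAS calls to the Collections API."""
    aliases["index"], aliases["search"] = aliases["search"], aliases["index"]

swap_after_full_import(aliases)
print(aliases)
```

Clients only ever use the alias names, so the cutover is atomic from their point of view and the previous index stays intact for rollback.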
>
> On Mar 3, 2017, at 11:22 AM, Alexandre Rafalovitch wrote:
>
> On 3 March 2017 at 12:17, Sales
> wrote:
>> When we enabled those, during the index, the data disappeared since it kept
>> soft committing during the import process,
>
> This part does not quite make sense. Could you expand on
On 3 March 2017 at 12:17, Sales wrote:
> When we enabled those, during the index, the data disappeared since it kept
> soft committing during the import process,
This part does not quite make sense. Could you expand on this "data
disappeared" part so we can understand what the issue is?
The main issue
On 1/25/2017 4:06 PM, Dan Scarf wrote:
> I upgraded Solr 6.3.0 this morning to 6.4.0. All seemed good according to
> the logs but this afternoon we discovered that the DataImport tabs in our
> Collections now say:
>
> 'Sorry, no dataimport-handler defined!'.
This is a bug that only applies to 6.4
On 12/11/2016 8:00 PM, Brian Narsi wrote:
> We are using Solr 5.1.0 and DIH to build index.
>
> We are using DIH with clean=true and commit=true and optimize=true.
> Currently retrieving about 10.5 million records in about an hour.
>
> I will like to find from other member's experiences as to how l
Am 12.12.2016 um 04:00 schrieb Brian Narsi:
> We are using Solr 5.1.0 and DIH to build index.
>
> We are using DIH with clean=true and commit=true and optimize=true.
> Currently retrieving about 10.5 million records in about an hour.
>
> I will like to find from other member's experiences as to
I ran my jar application beside a running Solr instance where I want to
trigger a DIH import.
I tried this approach:
String urlString1 = "http://localhost:8983/solr/db/dataimport";
SolrClient solr1 = new HttpSolrClient.Builder(urlString1).build();
ModifiableSolrParams params = new ModifiableSolrPar
on a quick glance, and not having tried this myself...
this seems wrong. You're setting a URL parameter "db":
params.set("db","/dataimport");
that's equivalent to a URL like
http://localhost:8983/solr&db=/dataimport
you'd want:
http://localhost:8983/solr/db/dataimport?command=full-import
I thin
Actually, to be honest, I realized that I only needed to trigger a data
import handler from a jar file. Previously this was done in earlier
versions via the SolrServer object. Now I am wondering if this is OK:
String urlString1 = "http://localhost:8983/solr/";
SolrClient solr1 = new HttpSolrClient.
I forgot to mention that I am creating a jar file beside a running Solr 6.3
instance, to which I am hoping to attach with Java via the SolrDispatchFilter
to get at the cores, so that I could then work with the data in code.
2016-11-25 19:31 GMT+01:00 Marek Ščevlík :
> Hi Daniel. Thanks for a reply. I won
Hi Daniel. Thanks for the reply. I wonder, is it still possible with the
release of Solr 6.3 to get hold of a running instance of the Jetty server
that is part of the solution? I found some code for previous versions where
it was captured with this code and one could then obtain cores for a
running so
Is your goal to still index into Solr? It was not clear.
If yes, then it has been discussed quite a bit. The challenge is that
DIH is integrated into AdminUI, which makes it easier to see the
progress and set some flags. Plus the required jars are loaded via
solrconfig.xml, just like all other ext
Marek,
I've wanted to do something like this in the past as well. However, a rewrite
that supports the same XML syntax might be better. There are several problems
with the design of the Data Import Handler that make it not quite suitable:
- Not designed for Multi-threading
- Bad implementati
Hello Jonas,
Did you figure this out?
Dr. Chuck Brooks
248-838-5070
-Original Message-
From: Jonas Vasiliauskas [mailto:jonas.vasiliaus...@yahoo.com.INVALID]
Sent: Saturday, July 02, 2016 11:37 AM
To: solr-user@lucene.apache.org
Subject: Data import handler in techproducts example
He
Hi Jonas,
Search for the solr-dataimporthandler-*.jar and place it under a lib directory
(same level as the solr.xml file) along with the MySQL JDBC driver
(mysql-connector-java-*.jar).
Please see:
https://cwiki.apache.org/confluence/display/solr/Lib+Directives+in+SolrConfig
On Saturday, July 2,
There's nothing saying you have
to highlight fields you search on. So you
can specify hl.fl to be the "normal" (perhaps
stored-only) fields and still search on the
uber-field.
Best,
Erick
On Thu, May 26, 2016 at 2:08 PM, kostali hassan
wrote:
> I did it , I copied all my dynamic field into text
I did it: I copied all my dynamic fields into the text field and it works great.
Just one question: even though I copied text into content and the inverse to
get highlighting, that does not work. Is there another way to get highlighting?
Thank you, Erick
2016-05-26 18:28 GMT+01:00 Erick Erickson :
> And, you can c
And, you can copy all of the fields into an "uber field" using the
copyField directive and just search the "uber field".
Best,
Erick
On Thu, May 26, 2016 at 7:35 AM, kostali hassan
wrote:
> Thank you, it makes sense.
> Have a good day.
>
> 2016-05-26 15:31 GMT+01:00 Siddhartha Singh Sandhu :
>
>>
Thank you, it makes sense.
Have a good day.
2016-05-26 15:31 GMT+01:00 Siddhartha Singh Sandhu :
> The schema.xml/managed_schema defines the default search field as `text`.
>
> You can make all fields that you want searchable type `text`.
>
> On Thu, May 26, 2016 at 10:23 AM, kostali hassan <
> med
The schema.xml/managed_schema defines the default search field as `text`.
You can make all fields that you want searchable type `text`.
On Thu, May 26, 2016 at 10:23 AM, kostali hassan
wrote:
> I import data from sql databases with DIH . I am looking for serch term in
> all fields not by field.
It's resolved after changing my column name... it's all case sensitive...
--
View this message in context:
http://lucene.472066.n3.nabble.com/Data-Import-Handler-Multivalued-fields-splitBy-tp4243667p4260301.html
Sent from the Solr - User mailing list archive at Nabble.com.
I am also having the same problem.
Have you resolved this issue?
"response": {
"numFound": 3,
"start": 0,
"docs": [
{
"genre": [
"Action|Adventure",
"Action",
"Adventure"
]
},
{
"genre": [
"Drama|Suspens
Hi
Dataimport section in web ui page still shows me that no data import handler
is defined. And no data is being added to my new collection.
The "other" collection (destination of the import) is the collection where that
data import handler definition resides.
Erik
> On Feb 16, 2016, at 01:54, vidya wrote:
>
> Hi
>
> I have gone through documents to define data import handler in solr. But i
> couldnot implement it.
> I have cr
You can start with one of the suggestions from this link based on your
indexing and query load.
https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
Thanks,
Susheel
On Mon, Feb 8, 2016 at 10:15 AM, Troy Edwards
wrote:
> We are running the
We have this for a collection which is updated every 3 minutes with a minimum
of 500 documents, plus a batch of about 10k documents at the start of each day:
${solr.autoCommit.maxTime:30}
1
true
true
${solr.autoSoftCommit.maxTime:6000}
As per the Solr documentation, if
While researching the space on the servers, I found that log files from
Sept 2015 are still there. These are solr_gc_log_datetime and
solr_log_datetime.
Is the default logging for Solr ok for production systems or does it need
to be changed/tuned?
Thanks,
On Tue, Feb 2, 2016 at 2:04 PM, Troy Edw
That is help!
Thank you for the thoughts.
On Tue, Feb 2, 2016 at 12:17 PM, Erick Erickson
wrote:
> Scratch that installation and start over?
>
> Really, it sounds like something is fundamentally messed up with the
> Linux install. Perhaps something as simple as file paths, or you have
> old ja
Scratch that installation and start over?
Really, it sounds like something is fundamentally messed up with the
Linux install. Perhaps something as simple as file paths, or you have
old jars hanging around that are mis-matched. Or someone manually
deleted files from the Solr install. Or your disk f
Rerunning the Data Import Handler on the Linux machine has
started producing some errors and warnings:
On the node on which DIH was started:
WARN SolrWriter Error creating document : SolrInputDocument
org.apache.solr.common.SolrException: No registered leader was found
after waiting fo
The first thing I'd be looking at is how the JDBC batch size compares
between the two machines.
AFAIK, Solr shouldn't notice the difference, and since a large majority
of the development is done on Linux-based systems, I'd be surprised if
this was worse than Windows, which would lead me to t
Sorry, I should explain further. The Data Import Handler had been running
for a while retrieving only about 15 records from the database. Both in
development env (windows) and linux machine it took about 3 mins.
The query has been changed and we are now trying to retrieve about 10
million reco
What happens if you run just the SQL query from the
windows box and from the linux box? Is there any chance
that somehow the connection from the linux box is
just slower?
Best,
Erick
On Mon, Feb 1, 2016 at 6:36 PM, Alexandre Rafalovitch
wrote:
> What are you importing from? Is the source and Sol
What are you importing from? Is the source and Solr machine collocated
in the same fashion on dev and prod?
Have you tried running this on a Linux dev machine? Perhaps your prod
machine is loaded much more than a dev.
Regards,
Alex.
Newsletter and resources for Solr beginners and intermed
Do you have a full stack trace? A bit hard to help without that.
On 24 Dec 2015 2:54 pm, "Midas A" wrote:
> Hi ,
>
>
> Please provide the steps to resolve the issue.
>
>
> com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException:
> Communications link failure during rollback(). Transa
That was it! Thank you!
On Fri, Dec 4, 2015 at 3:13 PM, Dyer, James
wrote:
> Brian,
>
> Be sure to have...
>
> transformer="RegexTransformer"
>
> ...in your tag. It’s the RegexTransformer class that looks
> for "splitBy".
>
> See https://wiki.apache.org/solr/DataImportHandler#RegexTransformer
Brian,
Be sure to have...
transformer="RegexTransformer"
...in your tag. It’s the RegexTransformer class that looks for
"splitBy".
See https://wiki.apache.org/solr/DataImportHandler#RegexTransformer for more
information.
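A minimal sketch of such an entity (table and column names are illustrative; splitBy takes a regular expression, so the pipe must be escaped):

```xml
<entity name="movie" query="SELECT id, genre FROM movies"
        transformer="RegexTransformer">
  <!-- "Action|Adventure" becomes the multivalued ["Action", "Adventure"] -->
  <field column="genre" splitBy="\|"/>
</entity>
```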
James Dyer
Ingram Content Group
-Original Message-
From: Br
The backup/restore approach in SOLR-5750 and in solrcloud_manager is
really just that - copying the index files.
On backup, it saves your index directories, and on restore, it puts them
in the data dir, moves a pointer for the current index dir, and opens a
new searcher. Both are mostly just wrapp
These are just Lucene indexes. There's the Cloud backup and restore
that is being worked on.
But if the index is static (i.e. not being indexed to), simply copying
the data/index (well, actually the whole data index and subdirs)
directory will backup and restore it. Copying the index directory bac
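A minimal sketch of that copy, with illustrative paths under /tmp standing in for a real core's data directory (stop indexing before copying, as noted above):

```shell
# The whole data dir (index plus subdirs) is copied while nothing is indexing.
CORE=/tmp/solr-backup-demo
rm -rf "$CORE"
mkdir -p "$CORE/data/index"
echo "fake segment" > "$CORE/data/index/_0.cfs"   # stand-in index file
cp -r "$CORE/data" "$CORE/data.backup"            # the backup
diff -r "$CORE/data" "$CORE/data.backup" && echo "backup matches"
```

Restoring is the reverse copy into data/ on a stopped (or about-to-reload) core.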
What are the caveats regarding the copy of a collection?
At this time DIH takes only about 10 minutes. So in case of accidental
delete we can just re-run the DIH. The reason I am thinking about backup is
just in case records are deleted accidentally and the DIH cannot be run
because the database i
https://github.com/whitepages/solrcloud_manager supports 5.x, and I added
some backup/restore functionality similar to SOLR-5750 in the last
release.
Like SOLR-5750, this backup strategy requires a shared filesystem, but
note that unlike SOLR-5750, I haven’t yet added any backup functionality
for
Sorry I forgot to mention that we are using SolrCloud 5.1.0.
On Tue, Nov 17, 2015 at 12:09 PM, KNitin wrote:
> afaik Data import handler does not offer backups. You can try using the
> replication handler to backup data as you wish to any custom end point.
>
> You can also try out : https://gi
afaik Data import handler does not offer backups. You can try using the
replication handler to backup data as you wish to any custom end point.
You can also try out : https://github.com/bloomreach/solrcloud-haft. This
helps backup solr indices across clusters.
On Tue, Nov 17, 2015 at 7:08 AM, Br
Thanks for your kind reply. I tried using both SqlEntityProcessor and setting
batchSize to -1, but didn't get any improvement. It'd be helpful if I could see
the Data Import Handler's log.
On Saturday, November 7, 2015, Alexandre Rafalovitch
wrote:
> LoL. Of course I meant SolrJ. I had to misspell the most
Yes the id is unique. If I only select distinct id,count(id) I get the same
results. However I found this is more likely a MySQL issue. I created a new
table called director1 and ran query "insert into director1 select * from
director" I got only 287041 results inserted, which was the same as Solr.
LoL. Of course I meant SolrJ. I had to misspell the most important
word of the hundreds I wrote in this thread :-)
Thank you Erick for the correction.
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/
On 7 November 2015 at 19:18, Erick Erickson wro
Alexandre, did you mean SolrJ?
Here's a way to get started
https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/
Best,
Erick
On Sat, Nov 7, 2015 at 2:22 PM, Alexandre Rafalovitch
wrote:
> Have you thought of just using Solr. Might be faster than troubleshooting
> DIH for complex scenarios
That's not quite the question I asked. Do a distinct on 'id' only in
the database itself. If your ids are NOT unique, you need to create a
composite or a virtual id for Solr. Because whatever your
solrconfig.xml say is uniqueKey will be used to deduplicate the
documents. If you have 10 documents wi
Hi thanks for the continued support. I'm really worried as my project
deadline is near. It was 1636549 in MySQL vs 287041 in Solr. I put select
distinct in the beginning of the query because IMDB doesn't have a table
for cast & crew. It puts movie and person and their roles into one huge
table 'cas
Just to get the paranoid option out of the way, is 'id' actually the
column that has unique ids in your database? If you do "select
distinct id from imdb.director" - how many items do you get?
Regards,
Alex.
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-s
Have you thought of just using Solr. Might be faster than troubleshooting
DIH for complex scenarios.
On 7 Nov 2015 3:39 pm, "Yangrui Guo" wrote:
> I found multiple strange things besides the slowness. I performed count(*)
> in MySQL but only one-fifth of the records were imported. Also sometimes
I found multiple strange things besides the slowness. I performed count(*)
in MySQL but only one-fifth of the records were imported. Also sometimes
dataimporthandler either doesn't import at all or only imports a portion
of the table. How can I debug the importer?
On Saturday, November 7, 2015, Y
I just realized that not everything was ok. Three child entities were not
imported. Had set batchSize to -1 but again solr was stuck :(
On Fri, Nov 6, 2015 at 3:11 PM, Yangrui Guo wrote:
> Thanks for the reply. I just removed CacheKeyLookUp and CachedKey and used
> WHERE clause instead. Everythi