Re: Is there any other way to load the index beside using http connection?
Off the top of my head... but aren't you supposed to activate the stream handler in SOLR? I think it is documented...

Cheers
//Marcus

On Mon, Jul 6, 2009 at 8:55 PM, Francis Yakin fya...@liquid.com wrote:
> Yes, I uploaded the CSV file that I get it from Database then I ran that cmd and I have the error. Any suggestions?

--
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.he...@tailsweep.com
http://www.tailsweep.com/
Re: Is there any other way to load the index beside using http connection?
Look at the error - it's bash (your command line shell) complaining. The '&' terminates one command and puts it in the background. Surrounding the URL with quotes will get you one step closer:

curl 'http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv&stream.contentType=text/plain;charset=utf-8'

-Yonik
http://www.lucidimagination.com

On Mon, Jul 6, 2009 at 2:11 PM, Francis Yakin fya...@liquid.com wrote:
> Ok, I have a CSV file(called it test.csv) from database.
>
> When I tried to upload this file to solr using this cmd, I got "stream.contentType=text/plain: No such file or directory" error
>
> curl http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv&stream.contentType=text/plain;charset=utf-8
> -bash: stream.contentType=text/plain: No such file or directory
> undefined field cat
>
> What did I do wrong?
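Editor's sketch of the quoting point made above. The variable names are made up for illustration; the host, port, and file path are the ones quoted in this thread. The script only builds and prints the URL, and the actual request is left commented out because it needs a running Solr instance.

```shell
#!/bin/sh
# Hypothetical variable names; the values are the ones quoted in this thread.
SOLR_CSV_URL='http://localhost:8983/solr/update/csv'
CSV_FILE='/opt/apache-1.2.0/example/exampledocs/test.csv'

# Inside quotes, '&' and ';' are literal characters. Unquoted, the shell
# treats '&' as "run in background" and ';' as a command separator.
FULL_URL="${SOLR_CSV_URL}?stream.file=${CSV_FILE}&stream.contentType=text/plain;charset=utf-8"

printf '%s\n' "$FULL_URL"

# Uncomment to send the request to a running Solr instance:
# curl "$FULL_URL"
```

Keeping the URL in a quoted variable means the shell passes it to curl as a single argument, which is what the single quotes in the command above achieve.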
RE: Is there any other way to load the index beside using http connection?
I did try:

curl 'http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv"&"stream.contentType=text/plain;charset=utf-8'

It doesn't work.

Francis
RE: Is there any other way to load the index beside using http connection?
With

curl 'http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv&stream.contentType=text/plain;charset=utf-8'

there are no errors now. But how can I verify that the update is happening?

Thanks
Francis
Re: Is there any other way to load the index beside using http connection?
The double quotes around the ampersand don't belong there. I think that UTF8 should also be the default, so the following should also work:

curl 'http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv'

-Yonik
http://www.lucidimagination.com

On Tue, Jul 7, 2009 at 1:37 PM, Francis Yakin fya...@liquid.com wrote:
> I did try:
>
> curl 'http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv"&"stream.contentType=text/plain;charset=utf-8'
>
> It doesn't work
RE: Is there any other way to load the index beside using http connection?
Yeah, it works now.

How can I verify that the new CSV file got uploaded?

Thanks
Francis
Re: Is there any other way to load the index beside using http connection?
On Tue, Jul 7, 2009 at 1:50 PM, Francis Yakin fya...@liquid.com wrote:
> yeah, It works now.
>
> How can I verify if the new CSV file get uploaded?

Point your browser at http://localhost:8983/solr/admin/stats.jsp
Check out the "UPDATE HANDLERS" section.

-Yonik
http://www.lucidimagination.com
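Editor's sketch of a command-line check, as an alternative to the stats page. It assumes Solr is at localhost:8983 as elsewhere in this thread and that the query parser in your version accepts the match-all query `*:*`; the function and variable names are hypothetical. Newly added documents only show up in the count after a commit.

```shell
#!/bin/sh
# Assumptions: Solr at localhost:8983 (as in this thread); '*:*' is accepted
# as a match-all query by the running Solr version.
SOLR_BASE='http://localhost:8983/solr'
COUNT_URL="${SOLR_BASE}/select?q=*:*&rows=0"

# Request zero rows; the numFound attribute in the response is the number of
# documents currently visible to searches.
count_docs() {
    curl "$COUNT_URL"
}

# With Solr running, call:  count_docs
printf '%s\n' "$COUNT_URL"
```

Running `count_docs` before and after the CSV load (and a commit) and comparing the numFound values gives a rough confirmation that documents were added.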
RE: Is there any other way to load the index beside using http connection?
Norberto,

You said last week:

> why not generate your SQL output directly into your oracle server as a file, upload the file to your SOLR server? Then the data file is local to your SOLR server , you will bypass any WAN and firewall you may be having. (or some variation of it, sql -> SOLR server as file, etc..)

I think this is the best solution that we can go with without changing too much of our setup.

As I said, we have a file named test.xml which comes from the SQL output; we put it locally on the Solr server under /opt/test.xml.

So I need to execute the commands on the Solr system to add this file to the Solr data/index.

What commands do I have to use, for example for the XML file named /opt/test.xml?

Thanks
Francis
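Editor's sketch for the question above, not an answer given in the thread. It assumes Solr listens on localhost:8983 and that /opt/test.xml is already in Solr's `<add><doc>...</doc></add>` update format; a raw database XML dump would need converting first. The function names are hypothetical.

```shell
#!/bin/sh
# Assumptions: Solr at localhost:8983; /opt/test.xml is in Solr's
# <add><doc>...</doc></add> update format.
UPDATE_URL='http://localhost:8983/solr/update'
XML_FILE='/opt/test.xml'
CONTENT_TYPE='Content-type: text/xml; charset=utf-8'

# Send the file as the request body (the leading @ makes curl read the file).
post_file() {
    curl "$UPDATE_URL" -H "$CONTENT_TYPE" --data-binary "@${XML_FILE}"
}

# Commit so the added documents become searchable.
commit_updates() {
    curl "$UPDATE_URL" -H "$CONTENT_TYPE" --data-binary '<commit/>'
}

# With Solr running, call:  post_file && commit_updates
printf 'would post %s to %s\n' "$XML_FILE" "$UPDATE_URL"
```

This still goes over HTTP, but only to localhost, so it stays clear of the WAN and firewall path discussed in the thread.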
Re: Is there any other way to load the index beside using http connection?
On Sun, 5 Jul 2009 21:36:35 +0200 Marcus Herou marcus.he...@tailsweep.com wrote:
> Sharing some of our exports from DB to solr. Note: many of the statements below might not work due to clip-clip.

thx Marcus - but that's a DIH config right? :)

b
_
{Beto|Norberto|Numard} Meijome

"I respect faith, but doubt is what gives you an education." Wilson Mizner

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
Re: Is there any other way to load the index beside using http connection?
On Sun, 5 Jul 2009 10:28:16 -0700 Francis Yakin fya...@liquid.com wrote:
[...]
> > upload the file to your SOLR server? Then the data file is local to your SOLR server , you will bypass any WAN and firewall you may be having. (or some variation of it, sql -> SOLR server as file, etc..)
>
> How we upload the file? Do we need to convert the data file to Lucene Index first? And Documentation how we do this?

pick your poison... rsync? ftp? scp ?

B
_
{Beto|Norberto|Numard} Meijome

"The freethinking of one age is the common sense of the next." Matthew Arnold
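Editor's sketch of the scp option mentioned above. Every path and host name below is a placeholder, not a value from this thread, and the copy itself is wrapped in a function that is not called, since it needs SSH access to a real host.

```shell
#!/bin/sh
# All names below are placeholders, not values from this thread.
DUMP_FILE='/tmp/export/test.csv'      # file written on the database host
SOLR_HOST='solr-master.example.com'   # hypothetical Solr master host
DEST_DIR='/opt/solr-import'           # hypothetical directory on the Solr host

# Copy the file over SSH; rsync -az to the same destination would work similarly.
upload_dump() {
    scp "$DUMP_FILE" "${SOLR_HOST}:${DEST_DIR}/"
}

# Path the file will have on the Solr host, e.g. for use with stream.file.
REMOTE_PATH="${DEST_DIR}/$(basename "$DUMP_FILE")"
printf '%s\n' "$REMOTE_PATH"
```

No conversion to a Lucene index is needed before the copy; the file is transferred as plain CSV or XML and indexed by Solr once it is local to the Solr host.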
Re: Is there any other way to load the index beside using http connection?
Yes, exactly. Just being friendly and sharing a working routine. It took me some hours to figure out DIH myself at the time.

//Marcus

On Mon, Jul 6, 2009 at 1:32 PM, Norberto Meijome numard...@gmail.com wrote:
> thx Marcus - but that's a DIH config right? :)

--
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.he...@tailsweep.com
http://www.tailsweep.com/
Re: Is there any other way to load the index beside using http connection?
On Mon, 6 Jul 2009 09:56:03 -0700 Francis Yakin fya...@liquid.com wrote:
> Norberto,
>
> Thanks, I think my questions is:
>
> > why not generate your SQL output directly into your oracle server as a file
>
> What type of file is this?

a file in a format that you can then import into SOLR.

_
{Beto|Norberto|Numard} Meijome

"Gravity cannot be blamed for people falling in love." Albert Einstein
RE: Is there any other way to load the index beside using http connection?
Ok, I have a CSV file (call it test.csv) from the database.

When I tried to upload this file to Solr using this command, I got a "stream.contentType=text/plain: No such file or directory" error:

curl http://localhost:8983/solr/update/csv?stream.file=/opt/apache-1.2.0/example/exampledocs/test.csv&stream.contentType=text/plain;charset=utf-8
-bash: stream.contentType=text/plain: No such file or directory
undefined field cat

What did I do wrong?

Francis
RE: Is there any other way to load the index beside using http connection?
Hi Francis,

In my experience, the update stream handler (for an XML file in my case) worked only for Solr running on the same machine. I got the same error when I tried to update documents on a remote Solr instance.

Regards
Nitin

--
View this message in context: http://www.nabble.com/Is-there-any-other-way-to-load-the-index-beside-using-%22http%22-connection--tp24297934p24360603.html
Sent from the Solr - User mailing list archive at Nabble.com.
RE: Is there any other way to load the index beside using http connection?
Yes, I uploaded the CSV file that I got from the database, then I ran that command and got the error.

Any suggestions?

Thanks
Francis
Re: Is there any other way to load the index beside using http connection?
On Thu, 2 Jul 2009 11:28:51 -0700 Francis Yakin fya...@liquid.com wrote:
> Norberto,

Hi Francis,
Please reply to the list, or keep it in CC.

> You saying:
>
> > Other alternatives are to transform the XML into csv and import it that way
>
> How do you transfer that CSV file to Solr?

http://wiki.apache.org/solr/UpdateCSV

There actually is a LOT of information in the wiki, as well as the mailing list archives.

good luck,
B
_
{Beto|Norberto|Numard} Meijome

"The freethinking of one age is the common sense of the next." Matthew Arnold
Re: Is there any other way to load the index beside using http connection?
On Thu, 2 Jul 2009 11:02:28 -0700 Francis Yakin fya...@liquid.com wrote:
> Norberto,
>
> Thanks for your input.
>
> What do you mean with "Have you tried connecting to SOLR over HTTP from localhost, therefore avoiding any firewall issues and network latency ? it should work a LOT faster than from a remote site." ?
>
> Here are how our servers lay out:
>
> 1) Database ( Oracle ) is running on separate machine
> 2) Solr master is running on separate machine by itself
> 3) 6 solr slaves ( these 6 pull the index from master using rsync)
>
> We have a SQL(Oracle) script to post the data/index from Oracle Database machine to Solr Master over http. We wrote those script(Someone in Oracle Database administrator write it).

You said in your other email you are having issues with slow transfers between 1) and 2). Your subject relates to the data transfer between 1) and 2); 2) and 3) is irrelevant to this part.

My question (what you quoted above) relates to the point you made about it being slow ( WHY is it slow?), and issues with opening so many connections through firewall. so, I'll rephrase my question (see below...)

[...]

> We can not do localhost since it's solr is not running on Oracle machine.

why not generate your SQL output directly into your oracle server as a file, upload the file to your SOLR server? Then the data file is local to your SOLR server , you will bypass any WAN and firewall you may be having. (or some variation of it, sql -> SOLR server as file, etc..)

Any speed issues that are rooted in the fact that you are posting via HTTP (vs embedded solr or DIH) aren't going to go away. But it's the simpler approach without changing too much of your current setup.

> Another alternative that we think of is to transform XML into CSV and import/export it.
>
> How about if LUSQL, some mentioned about this? Is this apps free(open source) application? Do you have any experience with this apps?

Not i, sorry.

Have you looked into DIH? It's designed for this kind of work.

B
_
{Beto|Norberto|Numard} Meijome

"Great spirits have often encountered violent opposition from mediocre minds." Albert Einstein
RE: Is there any other way to load the index beside using http connection?
Norberto,

Yes, DIH is one of the options we are considering, but it requires Solr 1.3.0 or above, and currently we are running Solr 1.2.0.

I am thinking of using a CSV file: convert the XML to CSV format on the database machine, then transport that CSV file to the Solr box. On the Solr box we run the update to convert the CSV file into the Lucene index.

We are also considering the approach you suggested; note my questions below:

> why not generate your SQL output directly into your oracle server as a file,

Question: What type of file is this (XML or CSV)?

> upload the file to your SOLR server? Then the data file is local to your SOLR server , you will bypass any WAN and firewall you may be having. (or some variation of it, sql -> SOLR server as file, etc..)

How do we upload the file? Do we need to convert the data file to a Lucene index first? And is there documentation on how to do this?

> Any speed issues that are rooted in the fact that you are posting via HTTP (vs embedded solr or DIH) aren't going to go away. But it's the simpler approach without changing too much of your current setup.
Re: Is there any other way to load the index beside using http connection?
Sharing some of our exports from DB to Solr. Note: many of the statements below might not work as-is due to clip-clip.

$SOLR_HOME/conf/dataConfig.xml

<dataConfig>
  <dataSource name="myfilereader" type="FileDataSource" />
  <document>
    <entity name="jc" rootEntity="false" dataSource="null"
            processor="FileListEntityProcessor"
            fileName="^.*\.xml$" recursive="false" baseDir="$dumpdir">
      <entity name="x" pk="uid" processor="XPathEntityProcessor"
              url="${jc.fileAbsolutePath}" forEach="/entries/entry"
              transformer="DateFormatTransformer" stream="true"
              dataSource="myfilereader">
        <field column="uid" xpath="/entries/entry/uid" />
        <!-- sorry hiding the rest-->
      </entity>
    </entity>
  </document>
</dataConfig>

# Then add this to solrconfig.xml

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

Restart Solr.

Issue a mysql dump:

mysql --xml -uXXX -pXXX -hXXX -DXXX -e "select MD5(link) as uid, DATE_FORMAT(publishedDate, \"%Y-%m-%dT%H:%i:%sZ\") as publishedDate from X" > $dumpdir/dump.xml

# Warning: Note the clean command which will wipe your index...

GET "http://$server:$port/$path/dataimport?command=full-import&clean=true&optimize=true"

Hope this helps out some.

Cheers
//Marcus

On Sun, Jul 5, 2009 at 7:28 PM, Francis Yakin fya...@liquid.com wrote:

Norberto,

Yes, DIH is one of the options we are considering, but it requires 1.3.0 and above, and we are currently running Solr 1.2.0.

I am thinking of using a CSV file: convert the XML to CSV format on the database machine, then transport that CSV file to the Solr box. On the Solr box we run the update to convert the CSV file to a Lucene index.

We are also considering the one you suggested; note my questions below:

> why not generate your SQL output directly into your oracle server as a file,

What type of file is this (XML or CSV)?

> upload the file to your SOLR server? Then the data file is local to your SOLR server, you will bypass any WAN and firewall you may be having. (or some variation of it, sql -> SOLR server as file, etc..)

How do we upload the file? Do we need to convert the data file to a Lucene index first? And is there documentation on how to do this?

> Any speed issues that are rooted in the fact that you are posting via HTTP (vs embedded solr or DIH) aren't going to go away. But it's the simpler approach without changing too much of your current setup.

-----Original Message-----
From: Norberto Meijome [mailto:numard...@gmail.com]
Sent: Sunday, July 05, 2009 3:57 AM
To: Francis Yakin
Cc: solr-user@lucene.apache.org
Subject: Re: Is there any other way to load the index beside using http connection?

On Thu, 2 Jul 2009 11:02:28 -0700 Francis Yakin fya...@liquid.com wrote:

> Norberto, Thanks for your input. What do you mean with "Have you tried connecting to SOLR over HTTP from localhost, therefore avoiding any firewall issues and network latency? it should work a LOT faster than from a remote site."?
>
> Here are how our servers lay out:
> 1) Database (Oracle) is running on separate machine
> 2) Solr master is running on separate machine by itself
> 3) 6 solr slaves (these 6 pull the index from master using rsync)
>
> We have a SQL (Oracle) script to post the data/index from Oracle Database machine to Solr Master over http. We wrote those script (someone in Oracle Database administrator write it).

You said in your other email you are having issues with slow transfers between 1) and 2). Your subject relates to the data transfer between 1) and 2); 2) and 3) is irrelevant to this part.

My question (what you quoted above) relates to the point you made about it being slow (WHY is it slow?), and issues with opening so many connections through the firewall. So, I'll rephrase my question (see below...)

[...]

> We can not do localhost since it's solr is not running on Oracle machine.

Why not generate your SQL output directly into your oracle server as a file, upload the file to your SOLR server? Then the data file is local to your SOLR server, you will bypass any WAN and firewall you may be having. (or some variation of it, sql -> SOLR server as file, etc..)

Any speed issues that are rooted in the fact that you are posting via HTTP (vs embedded solr or DIH) aren't going to go away. But it's the simpler approach without changing too much of your current setup.

> Another alternative that we think of is to transform XML into CSV and import/export it. How about if LUSQL, some mentioned about this? Is this apps free (open source) application? Do you have any experience with this apps?

Not I, sorry. Have you looked into DIH
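Since the flattened archive lost the `&` separators in the dataimport URL above, here is a minimal Python sketch of how that request URL could be assembled so the parameters stay separated. The function name `dataimport_url` and the host, port and path values are placeholders for illustration, not something from Marcus's setup. It defaults to `clean=false` because of the warning above that `clean` wipes the index.

```python
from urllib.parse import urlencode


def dataimport_url(server, port, path, command="full-import",
                   clean=False, optimize=True):
    """Build a /dataimport request URL with properly separated parameters.

    Hypothetical helper: server, port and path are whatever your
    Solr instance uses; nothing here contacts Solr.
    """
    params = [
        ("command", command),
        # clean=true wipes the existing index before importing.
        ("clean", "true" if clean else "false"),
        ("optimize", "true" if optimize else "false"),
    ]
    return f"http://{server}:{port}/{path.strip('/')}/dataimport?{urlencode(params)}"


if __name__ == "__main__":
    # Same request as in the mail above, with clean explicitly enabled.
    print(dataimport_url("localhost", 8983, "solr", clean=True))
```

When pasting such a URL into a shell, quote it; an unquoted `&` sends the command to the background and the remaining parameters never reach Solr.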
RE: Is there any other way to load the index beside using http connection?
Thanks Marcus,

I will give it a try on a test machine first.

Francis

-----Original Message-----
From: Marcus Herou [mailto:marcus.he...@tailsweep.com]
Sent: Sunday, July 05, 2009 12:37 PM
To: solr-user@lucene.apache.org
Cc: Norberto Meijome
Subject: Re: Is there any other way to load the index beside using http connection?

Sharing some of our exports from DB to solr. [...]
Re: Is there any other way to load the index beside using http connection?
On Wed, 1 Jul 2009 15:07:12 -0700 Francis Yakin fya...@liquid.com wrote:

> We have several thousands of xml files in database that we load it to solr master [...]
> Is there any other way to load the data/index from Database to solr master beside using http connection, so it means we just scp/ftp the xml file from Database system to solr master and let solr convert those to lucene indexes?

Francis,

after reading the whole thread, it seems you have:
- Data source: Oracle DB, in a separate location from your SOLR.
- Data format: XML output.

DIH is definitely a great option, but since you are on 1.2 it is not available to you (you should look into upgrading if you can!).

Have you tried connecting to SOLR over HTTP from localhost, therefore avoiding any firewall issues and network latency? It should work a LOT faster than from a remote site. Also make sure not to commit until you really need to.

Other alternatives are to transform the XML into CSV and import it that way, or to write a simple app that will parse the XML and post it directly using the embedded Solr method.

Plenty of options, all of them documented @ solr's site.

good luck,
b
_
{Beto|Norberto|Numard} Meijome

People demand freedom of speech to make up for the freedom of thought which they avoid. Soren Aabye Kierkegaard

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
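To illustrate the "don't commit until you really need to" advice, here is a rough Python sketch that groups documents into a few large `<add>` messages and emits a single `<commit/>` at the very end, instead of one request and one commit per document. It only builds the XML strings; how they are posted is left out. The helper names and the batch size are made up for the example, and it assumes documents are simple field-name to value mappings.

```python
import xml.etree.ElementTree as ET


def add_message(docs):
    """Serialize a list of dicts as one Solr XML <add> message."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def update_messages(docs, batch_size=1000):
    """Yield <add> messages of up to batch_size docs, then one <commit/>."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            yield add_message(batch)
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield add_message(batch)
    # Commit once, after everything has been added.
    yield "<commit/>"


if __name__ == "__main__":
    for message in update_messages([{"uid": i} for i in range(5)], batch_size=2):
        print(message)
```

Fewer, larger requests also mean fewer connections opened through the firewall, which was one of the complaints in the original question.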
Re: Is there any other way to load the index beside using http connection?
Francis,

I think both of these are on the Solr Wiki. You'll have to figure out how to export from the DB yourself, and you'll probably write a script/tool to read the export and rewrite it in the CSV format.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Francis Yakin fya...@liquid.com
To: solr-user@lucene.apache.org
Sent: Thursday, July 2, 2009 12:26:14 AM
Subject: RE: Is there any other way to load the index beside using http connection?

How you import the documents as csv data/file from Oracle Database to Solr master (they are two different machines)? And you have the doc for using EmbeddedSolrServer?

Thanks Otis!
Francis
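As a starting point for the "script/tool to read the export and rewrite it in the CSV format", here is a small Python sketch. It assumes the exported XML is already in Solr's `<add><doc><field name="...">` update format, which the thread suggests but does not confirm, and the function name and field list are invented for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET


def solr_xml_to_csv(xml_text, fieldnames):
    """Convert Solr <add><doc> XML into CSV text with a header row.

    Assumption: single-valued fields. If a field repeats inside a
    doc, the last value wins in this sketch.
    """
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for doc in root.iter("doc"):
        row = {name: "" for name in fieldnames}
        for field in doc.findall("field"):
            name = field.get("name")
            if name in row:
                row[name] = field.text or ""
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = (
        '<add><doc><field name="id">1</field>'
        '<field name="title">Hello, world</field></doc></add>'
    )
    print(solr_xml_to_csv(sample, ["id", "title"]), end="")
```

The `csv` module takes care of quoting values that contain commas or quotes, which is the part hand-rolled converters tend to get wrong.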
Re: Is there any other way to load the index beside using http connection?
LuSql can be found here:
http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql

User Manual:
http://cuvier.cisti.nrc.ca/~gnewton/lusql/v0.9/lusqlManual.pdf.html

LuSql can communicate directly with Oracle and create a Lucene index for you. Of course - as mentioned by other posters - you need to make sure the versions of Lucene and Solr are compatible (use same jars), you use the same Analyzers, and you create the appropriate 'schema' that Solr understands.

-glen

2009/7/2 Francis Yakin fya...@liquid.com:
> Glen,
> Database we use is Oracle, I am not the database administrator, so I don't familiar with their script. [...]
> You mentioned about LuSql, I am not familiar with that. Can you provide us the docs or something?
Re: Is there any other way to load the index beside using http connection?
2009/7/2 Francis Yakin fya...@liquid.com:
> Glen,
> Are you saying that we have to use LuSql replacing our Solr?

To load your data: Yes, it is an option.
To search your data: No, LuSql is only a loading tool.

-glen
RE: Is there any other way to load the index beside using http connection?
Norberto,

Thanks for your input.

What do you mean by "Have you tried connecting to SOLR over HTTP from localhost, therefore avoiding any firewall issues and network latency? it should work a LOT faster than from a remote site."?

Here is how our servers are laid out:

1) The database (Oracle) is running on a separate machine
2) The Solr master is running on a separate machine by itself
3) 6 Solr slaves (these 6 pull the index from the master using rsync)

We have a SQL (Oracle) script to post the data/index from the Oracle database machine to the Solr master over HTTP. One of our Oracle database administrators wrote that script.

In the Solr master configuration we have a scripts.conf that looks like this:

user=
solr_hostname=localhost
solr_port=7001
rsyncd_port=18983
data_dir=
webapp_name=solr
master_host=localhost
master_data_dir=solr/snapshot
master_status_dir=solr/status

So, basically, from the Oracle system we launch the Oracle/SQL script, which posts the data to the Solr master using http://solrmaster/solr/update (this URL is inside the SQL script).

We cannot use localhost since Solr is not running on the Oracle machine.

Another alternative that we are thinking of is to transform the XML into CSV and import/export it.

How about LuSql, which someone mentioned? Is it a free (open source) application? Do you have any experience with it?

Thanks all for your valuable suggestions!

Francis

-----Original Message-----
From: Norberto Meijome [mailto:numard...@gmail.com]
Sent: Thursday, July 02, 2009 3:01 AM
To: solr-user@lucene.apache.org
Cc: Francis Yakin
Subject: Re: Is there any other way to load the index beside using http connection?

Francis, after reading the whole thread, it seems you have: [...]
RE: Is there any other way to load the index beside using http connection?
Glen,

Is LuSql free? Is it open source? Does it require a machine separate from the Solr master?

I forgot to tell you that we have a master/slaves Solr environment. The database is Oracle, running on a separate machine in a different network from the Solr master and slaves (there is a firewall between the Oracle machine and the Solr machines).

If we have a LuSql machine, do you think it is better to put it in the same network as the database machine or as the Solr machines?

Do I need to create a SQL script to get the data from Oracle, load it using LuSql and convert it to a Lucene index? And how will the Solr master get that data?

Thanks
Francis

-----Original Message-----
From: Glen Newton [mailto:glen.new...@gmail.com]
Sent: Thursday, July 02, 2009 8:22 AM
To: solr-user@lucene.apache.org
Subject: Re: Is there any other way to load the index beside using http connection?

LuSql can be found here: http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql [...]
Re: Is there any other way to load the index beside using http connection?
2009/7/2 Francis Yakin fya...@liquid.com:

> Glen,
> Is this LuSql is free? Is that an open source.

LuSql is an Open Source project.

> Is that requires a separate machine with Solr Master

LuSql is a Java application that runs on the command line. It connects to the database using JDBC and creates a local Lucene index, based on the configuration you supply to it.

> I forgot to tell you that we have Master/Slaves environment of Solr. The Database is running Oracle and it's separate machine that running in different network than Master and Slaves Solr (There is a firewall between Oracle machine and Solr Machines).
>
> If we have LuSql Machine, do you think it's better to put into the same network with DataBase machine or Solr machines?

LuSql is heavily multi-threaded, and can suck up the resources of all cores (this is why it runs so fast), so you need to decide whether that is appropriate for your database machine (i.e. if it is a production machine). You can isolate LuSql to specific cores using something like numactl:
http://www.linuxmanpages.com/man8/numactl.8.php

> Do I need to create a sql script to get the data from Oracle and loading it using LuSql and convert it to Lucene index, and how solr master will get that data?

LuSql reads from Oracle and writes to a Lucene index. You just need to give LuSql a configuration that has it generate the appropriate index for Solr.

thanks,
Glen
http://zzzoot.blogspot.com/search?q=lucene
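For the numactl suggestion above, a tiny Python sketch of how a launcher script might prefix a command so it only runs on chosen cores. It uses numactl's `--physcpubind` option; the `java -jar lusql.jar` command line is purely a placeholder, since LuSql's real invocation and options are in its user manual and are not reproduced here.

```python
import shlex


def pin_to_cpus(command, cpus):
    """Return argv that runs `command` restricted to the given CPU ids
    via numactl --physcpubind."""
    cpu_list = ",".join(str(cpu) for cpu in cpus)
    return ["numactl", f"--physcpubind={cpu_list}", *command]


if __name__ == "__main__":
    # Placeholder command: substitute the actual LuSql invocation.
    argv = pin_to_cpus(["java", "-jar", "lusql.jar"], cpus=[0, 1])
    print(shlex.join(argv))
```

The resulting list can be handed to `subprocess.run(argv)` on a machine where numactl is installed; restricting the loader to a couple of cores leaves the rest free if it has to share a box with the database.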
Is there any other way to load the index beside using http connection?
We have several thousand XML files in a database that we load into the Solr master.

The database uses an HTTP connection to transfer those files to the Solr master. Solr then translates the XML files into its index.

We are experiencing issues with connections being opened and closed in the firewall, and the transfer is very, very slow.

Is there any other way to load the data/index from the database to the Solr master besides using an HTTP connection? That is, could we just scp/ftp the XML files from the database system to the Solr master and let Solr convert them to Lucene indexes?

Any input or help will be much appreciated.

Thanks
Francis
Re: Is there any other way to load the index beside using http connection?
Francis,

There are a number of things you can do to make indexing over HTTP faster. You can also import documents as CSV data/file. Finally, you can use EmbeddedSolrServer.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Francis Yakin fya...@liquid.com
To: solr-user@lucene.apache.org
Sent: Wednesday, July 1, 2009 6:07:12 PM
Subject: Is there any other way to load the index beside using http connection?

We have several thousands of xml files in database that we load it to solr master [...]
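For the CSV route, a short Python sketch of building the update URL for a CSV file that already sits on the Solr box. The `stream.file` and `stream.contentType` parameter names and the `/update/csv` path are the ones used elsewhere in this thread; the base URL and file path are placeholders, and it assumes remote streaming is enabled in solrconfig.xml.

```python
from urllib.parse import urlencode


def csv_update_url(base_url, file_path, content_type="text/plain;charset=utf-8"):
    """Build an /update/csv URL that asks Solr to read a server-local file.

    base_url and file_path are placeholders for your own setup.
    """
    params = {
        "stream.file": file_path,
        "stream.contentType": content_type,
    }
    # urlencode joins the parameters with '&' and escapes '/', ';' and '='.
    return f"{base_url.rstrip('/')}/update/csv?{urlencode(params)}"


if __name__ == "__main__":
    print(csv_update_url("http://localhost:8983/solr", "/tmp/test.csv"))
```

The file is read by Solr itself from its local disk, so only this short request crosses the network, not the data. A separate commit is still needed afterwards for the documents to become searchable.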
Re: Is there any other way to load the index beside using http connection?
You can directly load to the backend Lucene using LuSql[1]. It is faster than Solr, sometimes as much as an order of magnitude faster.

Disclosure: I am the author of LuSql

-Glen
http://zzzoot.blogspot.com/

[1] http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql

2009/7/1 Francis Yakin fya...@liquid.com:
> We have several thousands of xml files in database that we load it to solr master [...]
RE: Is there any other way to load the index beside using http connection?
Otis,

Do you have documentation on how to do the things you mentioned?

What if I don't want to use HTTP at all? Or do we have no other option than using HTTP to transfer the XML files from the DB box to the Solr master?

Thanks
Francis

-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
Sent: Wednesday, July 01, 2009 8:01 PM
To: solr-user@lucene.apache.org
Subject: Re: Is there any other way to load the index beside using http connection?

Francis, There are a number of things you can do to make indexing over HTTP faster. You can also import documents as csv data/file. Finally, you can use EmbeddedSolrServer. [...]
RE: Is there any other way to load the index beside using http connection?
Glen,

The database we use is Oracle. I am not the database administrator, so I am not familiar with their script. Basically, we have an Oracle SQL script that loads the XML files over an HTTP connection to our Solr master.

My question is: is there any other way, instead of using an HTTP connection, to load the XML files to our Solr master?

You mentioned LuSql; I am not familiar with it. Can you point us to the docs or something?

Again, I am not the database guy, I am only the Solr guy. The database is on a different box than the Solr master, and both are running Linux (RedHat).

Thanks
Francis

-----Original Message-----
From: Glen Newton [mailto:glen.new...@gmail.com]
Sent: Wednesday, July 01, 2009 8:06 PM
To: solr-user@lucene.apache.org
Subject: Re: Is there any other way to load the index beside using http connection?

You can directly load to the backend Lucene using LuSql[1]. [...]
RE: Is there any other way to load the index beside using http connection?
Glen,

Are you saying that we have to use LuSql replacing our Solr?

Francis

-----Original Message-----
From: Glen Newton [mailto:glen.new...@gmail.com]
Sent: Wednesday, July 01, 2009 8:06 PM
To: solr-user@lucene.apache.org
Subject: Re: Is there any other way to load the index beside using http connection?

You can directly load to the backend Lucene using LuSql[1]. [...]
RE: Is there any other way to load the index beside using http connection?
How do you import the documents as CSV data/file from the Oracle database to the Solr master (they are two different machines)? And do you have the docs for using EmbeddedSolrServer?

Thanks Otis!
Francis

-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
Sent: Wednesday, July 01, 2009 8:01 PM
To: solr-user@lucene.apache.org
Subject: Re: Is there any other way to load the index beside using http connection?

Francis,

There are a number of things you can do to make indexing over HTTP faster. You can also import documents as CSV data/file. Finally, you can use EmbeddedSolrServer.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
From: Francis Yakin fya...@liquid.com
To: solr-user@lucene.apache.org
Sent: Wednesday, July 1, 2009 6:07:12 PM
Subject: Is there any other way to load the index beside using http connection?

We have several thousand XML files in a database that we load into the Solr master. The database uses an HTTP connection to transfer those files to the Solr master, and Solr then translates the XML files into its index. We are experiencing issues with connections being closed and reopened at the firewall, and the load is very, very slow.

Is there any other way to load the data/index from the database to the Solr master besides using an HTTP connection? In other words, could we just scp/ftp the XML files from the database system to the Solr master and let Solr convert them to Lucene indexes?

Any input or help will be much appreciated.

Thanks
Francis
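For the CSV route, one way to avoid shell quoting trouble is to build the request URL in code. The sketch below assumes the CSV file has already been copied to the Solr machine and that remote streaming is enabled in solrconfig.xml, since `stream.file` names a path on the Solr server itself. The function name `csv_update_url` and the example path are made up for illustration; `stream.file`, `stream.contentType`, and `commit` are the request parameters of the CSV update handler as far as I know, so check them against the documentation for the Solr version in use.

```python
from urllib.parse import urlencode


def csv_update_url(base_url, csv_path, charset="utf-8", commit=True):
    """Build the URL that asks Solr's CSV handler to load a server-side file.

    `csv_path` must be a path that exists on the Solr machine, not on the
    client. Every parameter is percent-encoded, so the ';' and '=' inside
    the content type cannot be misread by a shell or by the URL parser.
    """
    params = [
        ("stream.file", csv_path),
        ("stream.contentType", f"text/plain;charset={charset}"),
    ]
    if commit:
        params.append(("commit", "true"))
    return f"{base_url}/update/csv?{urlencode(params)}"


# Hypothetical path; substitute wherever the CSV was copied to on the master.
print(csv_update_url("http://localhost:8983/solr", "/opt/data/test.csv"))
```

If the resulting URL is passed to curl instead of being requested from code, it should be wrapped in quotes, because an unquoted `&` or `;` is treated by the shell as a command separator.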
Re: Is there any other way to load the index beside using http connection?
Did you explore DIH (DataImportHandler)?

http://wiki.apache.org/solr/DataImportHandler

It has features to import from databases, XML files, etc.

On Thu, Jul 2, 2009 at 3:37 AM, Francis Yakin fya...@liquid.com wrote:
> We have several thousand XML files in a database that we load into the Solr master. The database uses an HTTP connection to transfer those files to the Solr master, and Solr then translates the XML files into its index. We are experiencing issues with connections being closed and reopened at the firewall, and the load is very, very slow.
>
> Is there any other way to load the data/index from the database to the Solr master besides using an HTTP connection? In other words, could we just scp/ftp the XML files from the database system to the Solr master and let Solr convert them to Lucene indexes?
>
> Any input or help will be much appreciated.
>
> Thanks
> Francis

--
Noble Paul | Principal Engineer | AOL | http://aol.com
RE: Is there any other way to load the index beside using http connection?
Thanks Noble!

Is this only for version 1.3.0? We are currently running 1.2.0.

Francis

-----Original Message-----
From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble Paul
Sent: Wednesday, July 01, 2009 9:43 PM
To: solr-user@lucene.apache.org
Subject: Re: Is there any other way to load the index beside using http connection?

Did you explore DIH (DataImportHandler)?

http://wiki.apache.org/solr/DataImportHandler

It has features to import from databases, XML files, etc.

--
Noble Paul | Principal Engineer | AOL | http://aol.com
Re: Is there any other way to load the index beside using http connection?
On Thu, Jul 2, 2009 at 10:24 AM, Francis Yakin fya...@liquid.com wrote:
> This is only for version 1.3.0? We are running 1.2.0 currently.

Yes, DIH is available since 1.3 only.

--
Regards,
Shalin Shekhar Mangar.