Re: Mysql data import issue
Thanks, Gora, for your suggestions. Since my table contains a lot of fields, and all the other fields have the same name in Solr and MySQL, I thought I could give a mapping only for the one that differs and leave the rest as is. But is not selecting the id field in the returned query the only way to achieve that? Thanks, Sriram -- View this message in context: http://lucene.472066.n3.nabble.com/Mysql-data-import-issue-tp4253998p4254210.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Mysql data import issue
On 29 January 2016 at 04:13, vsriram30 wrote: > Hi, > I am using Solr 4.6.1 and I am trying to import my data from mysql to solr. > > In mysql, I have a table with columns, > id, legacyid, otherfields... > [...] > But still I get mysql id field to solr id field mapping. Please let me know > how to prevent this from happening. > How about if you do not select the mysql "id" field in the query attribute for the entity? Regards, Gora
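Concretely, Gora's suggestion amounts to a DIH entity whose SQL never selects the MySQL id column, aliasing legacyid instead. A minimal sketch of such a data-config.xml follows; the table name, connection details, and the other field names are placeholders, not taken from the thread:

```xml
<!-- data-config.xml sketch: select legacyid under the alias "id" and do not
     select the MySQL id column at all, so DIH never sees it. -->
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/mydb"
              user="user" password="password"/>
  <document>
    <entity name="item"
            query="SELECT legacyid AS id, otherfield FROM mytable">
      <!-- No explicit field mapping needed: the alias already matches the
           Solr field name, and unselected columns cannot leak through. -->
    </entity>
  </document>
</dataConfig>
```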
Mysql data import issue
Hi, I am using Solr 4.6.1 and I am trying to import my data from MySQL to Solr. In MySQL, I have a table with columns: id, legacyid, otherfields... In Solr I have columns: id, other fields. I want to map the legacyid field of my MySQL table to Solr's id column and skip the "id" field of MySQL while doing the import. Hence I have a field mapping from legacyid to id, but I still get a one-to-one mapping of my MySQL id field to Solr's id field. I even mapped the id field of MySQL to an empty Solr field, but I still get the MySQL id to Solr id mapping. Can you please let me know how to prevent this from happening? Thanks, Sriram -- View this message in context: http://lucene.472066.n3.nabble.com/Mysql-data-import-issue-tp4253998.html
Re: MySQL Data import handler
Hi Baskar, Just create a single schema.xml which contains the required fields from the 3 tables. Add a status column to the child table, i.e. 1 = add, 2 = update, 3 = delete, 4 = indexed, etc. Write a program using SolrJ which reads the status and acts accordingly. Rgds AJ On 15-Sep-2013, at 5:46, Baskar Sikkayan wrote: > Hi, > If i am supposed to go with Java client, should i still do any > configurations in solrconfig.xml or schema.xml. > > Thanks, > Baskar.S > > > On Sat, Sep 14, 2013 at 8:46 PM, Gora Mohanty wrote: > >> On 14 September 2013 20:07, Baskar Sikkayan wrote: >>> Hi Gora, >>>Thanks a lot for your reply. >>> My requirement is to combine 3 tables in mysql for search operation and >>> planning to sync these 3 tables( not all the columns ) in Apache Solr. >>> Whenever there is any change( adding a new row, deleting a row, modifying >>> the column data( any column in the 3 tables ) ), the same has to updated >> in >>> solr. Guess, for this requirement, instead of going with delta-import, >>> Apachae Solar java client will be of useful. >> [...] >> >> Yes, if you are comfortable with programming in Java, >> the Solr client would be a good alternative, though the >> DataImportHandler can also do what you want. >> >> Regards, >> Gora >>
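The status-driven loop AJ describes can be sketched as below. Everything here is hypothetical beyond AJ's 1-4 status convention; the SolrJ calls that would do the actual indexing work are indicated only in comments, since they depend on your schema and server setup:

```java
// Sketch of AJ's suggestion: a status column on the child table drives
// what the SolrJ sync program does with each row.
//   1 = add, 2 = update, 3 = delete, 4 = indexed (nothing to do)
public class StatusSync {
    static String actionFor(int status) {
        switch (status) {
            case 1: return "add";
            case 2: return "update"; // in Solr an update is a full re-add of the document
            case 3: return "delete";
            default: return "skip";  // 4 (already indexed) or anything unexpected
        }
    }

    public static void main(String[] args) {
        // In the real program you would iterate over child_data rows from JDBC,
        // then: server.add(doc) for "add"/"update", server.deleteById(id) for
        // "delete", and finally write status = 4 back to MySQL.
        int[] pendingStatuses = {1, 2, 3, 4};
        for (int s : pendingStatuses) {
            System.out.println(s + " -> " + actionFor(s));
        }
    }
}
```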
Re: MySQL Data import handler
Hi, If I am supposed to go with the Java client, should I still do any configuration in solrconfig.xml or schema.xml? Thanks, Baskar.S On Sat, Sep 14, 2013 at 8:46 PM, Gora Mohanty wrote: > On 14 September 2013 20:07, Baskar Sikkayan wrote: > > Hi Gora, > > Thanks a lot for your reply. > > My requirement is to combine 3 tables in mysql for search operation and > > planning to sync these 3 tables( not all the columns ) in Apache Solr. > > Whenever there is any change( adding a new row, deleting a row, modifying > > the column data( any column in the 3 tables ) ), the same has to updated > in > > solr. Guess, for this requirement, instead of going with delta-import, > > Apachae Solar java client will be of useful. > [...] > > Yes, if you are comfortable with programming in Java, > the Solr client would be a good alternative, though the > DataImportHandler can also do what you want. > > Regards, > Gora >
Re: MySQL Data import handler
On 14 September 2013 20:07, Baskar Sikkayan wrote: > Hi Gora, > Thanks a lot for your reply. > My requirement is to combine 3 tables in mysql for search operation and > planning to sync these 3 tables( not all the columns ) in Apache Solr. > Whenever there is any change( adding a new row, deleting a row, modifying > the column data( any column in the 3 tables ) ), the same has to updated in > solr. Guess, for this requirement, instead of going with delta-import, > Apachae Solar java client will be of useful. [...] Yes, if you are comfortable with programming in Java, the Solr client would be a good alternative, though the DataImportHandler can also do what you want. Regards, Gora
Re: MySQL Data import handler
Hi Gora, Thanks a lot for your reply. My requirement is to combine 3 tables in MySQL for search, and I am planning to sync these 3 tables (not all the columns) with Apache Solr. Whenever there is any change (adding a new row, deleting a row, or modifying the column data in any of the 3 tables), the same has to be updated in Solr. I guess, for this requirement, instead of going with delta-import, the Apache Solr Java client will be useful. Could you please share your view? Thanks, Baskar.S On Sat, Sep 14, 2013 at 7:25 PM, Gora Mohanty wrote: > On 14 September 2013 18:46, Baskar Sikkayan wrote: > > Hi, > > I am new to solar and trying for MySQL data import handler. > > > > I have 3 tables in mysql. > [...] > > Your question is unclear, and you would probably benefit > by doing some basic homework on Solr and importing > data into it. Please start from the Solr Wiki: > http://wiki.apache.org/solr/ > > For indexing from a database, you can use the DataImportHandler: > http://wiki.apache.org/solr/DataImportHandler > http://wiki.apache.org/solr/DIHQuickStart > What you want should be pretty straightforward. After > setting up basic data import, please refer to delta-import > for adding new data to Solr: > http://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command > > Regards, > Gora >
Re: MySQL Data import handler
On 14 September 2013 18:46, Baskar Sikkayan wrote: > Hi, > I am new to solar and trying for MySQL data import handler. > > I have 3 tables in mysql. [...] Your question is unclear, and you would probably benefit by doing some basic homework on Solr and importing data into it. Please start from the Solr Wiki: http://wiki.apache.org/solr/ For indexing from a database, you can use the DataImportHandler: http://wiki.apache.org/solr/DataImportHandler http://wiki.apache.org/solr/DIHQuickStart What you want should be pretty straightforward. After setting up basic data import, please refer to delta-import for adding new data to Solr: http://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command Regards, Gora
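Following the delta-import link above, a delta-capable DIH entity could look roughly like the sketch below. The table, column, and field names are placeholders, and a last_modified timestamp column maintained by the application (or a MySQL trigger) is assumed:

```xml
<!-- Sketch of a delta-capable entity, per the delta-import wiki page.
     deltaQuery finds changed primary keys since the last import;
     deltaImportQuery re-fetches each changed row by its key. -->
<entity name="item" pk="id"
        query="SELECT id, title FROM item"
        deltaQuery="SELECT id FROM item
                    WHERE last_modified &gt; '${dataimporter.last_index_time}'"
        deltaImportQuery="SELECT id, title FROM item
                          WHERE id = '${dataimporter.delta.id}'"/>
```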
MySQL Data import handler
Hi, I am new to Solr and am trying out the MySQL data import handler. I have 3 tables in MySQL: 1) user 2) master_data 3) child_data. The child_data table has foreign key references to master_data and user. In the child_data table, I have the following columns: 1) date 2) fee 3) location 4) type 5) text 6) user_id 7) master_data_id. In the user table, the display name field might change at any time. In the master table, certain fields must be indexed in Solr. So I am combining these 3 tables and trying to index them for a search page. Data will be added to these tables at any time, and the fields might change at any time. I am not sure how to keep Solr updated whenever there is any change in the tables. I am also looking to delete documents from the Solr index based on the date field. Please help me with this. Thanks, Baskar.S
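For the date-based deletion asked about at the end, one option (independent of DIH) is posting a delete-by-query to the update handler. The field name and the 90-day cutoff below are only examples:

```xml
<!-- POST to the core's /update handler, then commit.
     Removes every document whose date is older than 90 days. -->
<delete>
  <query>date:[* TO NOW-90DAYS]</query>
</delete>
```

For instance: curl 'http://localhost:8983/solr/update?commit=true' -H 'Content-Type: text/xml' -d '<delete><query>date:[* TO NOW-90DAYS]</query></delete>'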
Re: MySQL data import
On 12/11/2011 1:54 PM, Brian Lamb wrote: By nature of my schema, I have several multivalued fields. Each one I populate with a separate entity. Is there a better way to do it? For example, could I pull in all the singular data in one sitting and then come back in later and populate with the multivalued items. An alternate approach in some cases would be to do a GROUP_CONCAT and then populate the multivalued column with some transformation. Is that possible? Lastly, is it possible to use copyField to copy three regular fields into one multiValued field and have all the data show up? The best way to proceed may depend on whether you actually need the field to be multivalued (returning an array in search results), or if you simply need to be able to search on all the values. For me, it's the latter - the field isn't stored. I use the GROUP_CONCAT method (hidden in a database view, so Solr doesn't need to know about it) to put multiple values into a field, separated by semicolons. I then use a single-valued fieldType to split those up and make all the values searchable. The tokenizer splits by semicolons followed by zero or more spaces, the pattern filter strips leading and trailing punctuation from each token. The ICU filter is basically a better implementation of the ascii folding filter and the lowercase filter, in a single pass. The others are fairly self-explanatory. [The fieldType definition itself was stripped by the mailing list archive.] If you actually do need the field to be multivalued, then you'll need to do dataimport transformation as mentioned by Gora, who also replied. Thanks, Shawn
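The fieldType XML did not survive the archive; here is a sketch reconstructed from Shawn's description (the type name, attribute values, and exact patterns are guesses, not his original):

```xml
<fieldType name="semicolonDelimited" class="solr.TextField"
           positionIncrementGap="0" omitNorms="true">
  <analyzer>
    <!-- split on semicolons followed by zero or more spaces -->
    <tokenizer class="solr.PatternTokenizerFactory" pattern="; *"/>
    <!-- strip leading and trailing punctuation from each token -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^\p{Punct}+|\p{Punct}+$" replacement="" replace="all"/>
    <!-- ICU folding: ascii folding plus lowercasing in a single pass -->
    <filter class="solr.ICUFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```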
Re: MySQL data import
Here's a quick demo I wrote at one point. I haven't run it in a while, but you should be able to get the idea.

package jdbc;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;
import org.apache.solr.client.solrj.impl.XMLResponseParser;
import org.apache.solr.common.SolrInputDocument;

import java.io.IOException;
import java.sql.*;
import java.util.ArrayList;
import java.util.Collection;

public class Indexer {
    public static void main(String[] args) {
        startIndex("http://localhost:8983/solr");
    }

    private static void startIndex(String url) {
        Connection con = DataSource.getConnection();
        try {
            long start = System.currentTimeMillis();
            // Create a multi-threaded communications channel to the Solr server. Full interface (3.3) at:
            // http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/impl/StreamingUpdateSolrServer.html
            StreamingUpdateSolrServer server = new StreamingUpdateSolrServer(url, 10, 4);

            // You may want to set these timeouts higher; Solr occasionally has long pauses while
            // segments merge.
            server.setSoTimeout(1000); // socket read timeout
            server.setConnectionTimeout(100);
            //server.setDefaultMaxConnectionsPerHost(100);
            //server.setMaxTotalConnections(100);
            //server.setFollowRedirects(false); // defaults to false
            // allowCompression defaults to false.
            // Server side must support gzip or deflate for this to have any effect.
            //server.setAllowCompression(true);
            server.setMaxRetries(1); // defaults to 0. > 1 not recommended.
            server.setParser(new XMLResponseParser()); // binary parser is used by default

            doDocuments(server, con);
            server.commit(); // Only needs to be done at the end; autocommit or commitWithin
                             // should do the rest.

            long endTime = System.currentTimeMillis();
            System.out.println("Total Time Taken->" + (endTime - start) + " mils");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            DataSource.closeConnection(con);
        }
    }

    private static void doDocuments(StreamingUpdateSolrServer server, Connection con)
            throws SQLException, IOException, SolrServerException {
        Statement st = con.createStatement();
        ResultSet rs = st.executeQuery("select id,title,text from test");
        // SolrInputDocument interface (3.3) at
        // http://lucene.apache.org/solr/api/org/apache/solr/common/SolrInputDocument.html
        Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
        int total = 0;
        int counter = 0;
        while (rs.next()) {
            // DO NOT move this outside the while loop, or be sure to call doc.clear()
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", rs.getString("id"));
            doc.addField("title", rs.getString("title"));
            doc.addField("text", rs.getString("text"));
            docs.add(doc);
            ++counter;
            ++total;
            if (counter > 1000) { // Completely arbitrary, just batch up more than one document for throughput!
                server.add(docs);
                docs.clear();
                counter = 0;
            }
        }
        if (!docs.isEmpty()) { // Don't forget the final, partial batch.
            server.add(docs);
        }
        rs.close();
        st.close();
        System.out.println("Total " + total + " docs added successfully");
    }
}

// Trivial class showing connecting to a MySQL database server via JDBC...
class DataSource {
    public static Connection getConnection() {
        Connection conn = null;
        try {
            Class.forName("com.mysql.jdbc.Driver").newInstance();
            System.out.println("Driver Loaded..");
            conn = DriverManager.getConnection("jdbc:mysql://172.16.0.169:3306/test?"
                    + "user=testuser&password=test123");
            System.out.println("Connection build..");
        } catch (Exception ex) {
            System.out.println(ex);
        }
        return conn;
    }

    public static void closeConnection(Connection con) {
        try {
            if (con != null) con.close();
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}

On Mon, Dec 12, 2011 at 2:57 PM, Brian Lamb wrote: > Thanks all.
Erick, is there documentation on doing things with SolrJ and a > JDBC connection? > > On Mon, Dec 12, 2011 at 1:34 PM, Erick Erickson > wrote: > >> You might want to consider just doing the whole >> thing in SolrJ with a JDBC connection. When things >> get complex, it's sometimes more straightforward. >> >> Best >> Erick... >> >> P.S. Yes, it's pretty standard to have a single >> field be the destination for several copyField >> directives. >> >> On Mon, Dec 12, 2011 at 12:48 PM, Gora Mohanty wrote: >> > On Mon, Dec 12, 2011 at 2:24 AM, Brian Lamb >> > wrote: [...]
Re: MySQL data import
Thanks all. Erick, is there documentation on doing things with SolrJ and a JDBC connection? On Mon, Dec 12, 2011 at 1:34 PM, Erick Erickson wrote: > You might want to consider just doing the whole > thing in SolrJ with a JDBC connection. When things > get complex, it's sometimes more straightforward. > > Best > Erick... > > P.S. Yes, it's pretty standard to have a single > field be the destination for several copyField > directives. > > On Mon, Dec 12, 2011 at 12:48 PM, Gora Mohanty wrote: > > On Mon, Dec 12, 2011 at 2:24 AM, Brian Lamb > > wrote: > >> Hi all, > >> > >> I have a few questions about how the MySQL data import works. It seems > it > >> creates a separate connection for each entity I create. Is there any > way to > >> avoid this? > > > > Not sure, but I do not think that it is possible. However, from your > description > > below, I think that you are unnecessarily multiplying entities. > > > >> By nature of my schema, I have several multivalued fields. Each one I > >> populate with a separate entity. Is there a better way to do it? For > >> example, could I pull in all the singular data in one sitting and then > come > >> back in later and populate with the multivalued items. > > > > Not quite sure as to what you mean. Would it be possible for you > > to post your schema.xml, and the DIH configuration file? Preferably, > > put these on pastebin.com, and send us links. Also, you should > > obfuscate details like access passwords. > > > >> An alternate approach in some cases would be to do a GROUP_CONCAT and > then > >> populate the multivalued column with some transformation. Is that > possible? > > [...] > > > > This is how we have been handling it. A complete description would > > be long, but here is the gist of it: > > * A transformer will be needed. In this case, we found it easiest > > to use a Java-based transformer. 
Thus, your entity should include > > something like > > > transformer="com.mycompany.search.solr.handler.JobsNumericTransformer...> > > ... > > > > Here, the class name to be used for the transformer attribute follows > > the usual Java rules, and the .jar needs to be made available to Solr. > > * The SELECT statement for the entity looks something like > > select group_concat( myfield SEPARATOR '@||@')... > > The separator should be something that does not occur in your > > normal data stream. > > * Within the entity, define > > > > * There are complications involved if NULL values are allowed > > for the field, in which case you would need to use COALESCE, > > maybe along with CAST > > * The transformer would look up "myfield", split along the separator, > > and populate the multi-valued field. > > > > This *is* a little complicated, so I would also like to hear about > > possible alternatives. > > > > Regards, > > Gora >
Re: MySQL data import
You might want to consider just doing the whole thing in SolrJ with a JDBC connection. When things get complex, it's sometimes more straightforward. Best Erick... P.S. Yes, it's pretty standard to have a single field be the destination for several copyField directives. On Mon, Dec 12, 2011 at 12:48 PM, Gora Mohanty wrote: > On Mon, Dec 12, 2011 at 2:24 AM, Brian Lamb > wrote: >> Hi all, >> >> I have a few questions about how the MySQL data import works. It seems it >> creates a separate connection for each entity I create. Is there any way to >> avoid this? > > Not sure, but I do not think that it is possible. However, from your > description > below, I think that you are unnecessarily multiplying entities. > >> By nature of my schema, I have several multivalued fields. Each one I >> populate with a separate entity. Is there a better way to do it? For >> example, could I pull in all the singular data in one sitting and then come >> back in later and populate with the multivalued items. > > Not quite sure as to what you mean. Would it be possible for you > to post your schema.xml, and the DIH configuration file? Preferably, > put these on pastebin.com, and send us links. Also, you should > obfuscate details like access passwords. > >> An alternate approach in some cases would be to do a GROUP_CONCAT and then >> populate the multivalued column with some transformation. Is that possible? > [...] > > This is how we have been handling it. A complete description would > be long, but here is the gist of it: > * A transformer will be needed. In this case, we found it easiest > to use a Java-based transformer. Thus, your entity should include > something like > transformer="com.mycompany.search.solr.handler.JobsNumericTransformer...> > ... > > Here, the class name to be used for the transformer attribute follows > the usual Java rules, and the .jar needs to be made available to Solr. 
> * The SELECT statement for the entity looks something like > select group_concat( myfield SEPARATOR '@||@')... > The separator should be something that does not occur in your > normal data stream. > * Within the entity, define > > * There are complications involved if NULL values are allowed > for the field, in which case you would need to use COALESCE, > maybe along with CAST > * The transformer would look up "myfield", split along the separator, > and populate the multi-valued field. > > This *is* a little complicated, so I would also like to hear about > possible alternatives. > > Regards, > Gora
Re: MySQL data import
On Mon, Dec 12, 2011 at 2:24 AM, Brian Lamb wrote: > Hi all, > > I have a few questions about how the MySQL data import works. It seems it > creates a separate connection for each entity I create. Is there any way to > avoid this? Not sure, but I do not think that it is possible. However, from your description below, I think that you are unnecessarily multiplying entities. > By nature of my schema, I have several multivalued fields. Each one I > populate with a separate entity. Is there a better way to do it? For > example, could I pull in all the singular data in one sitting and then come > back in later and populate with the multivalued items. Not quite sure as to what you mean. Would it be possible for you to post your schema.xml, and the DIH configuration file? Preferably, put these on pastebin.com, and send us links. Also, you should obfuscate details like access passwords. > An alternate approach in some cases would be to do a GROUP_CONCAT and then > populate the multivalued column with some transformation. Is that possible? [...] This is how we have been handling it. A complete description would be long, but here is the gist of it: * A transformer will be needed. In this case, we found it easiest to use a Java-based transformer. Thus, your entity should include something like transformer="com.mycompany.search.solr.handler.JobsNumericTransformer..." Here, the class name to be used for the transformer attribute follows the usual Java rules, and the .jar needs to be made available to Solr. * The SELECT statement for the entity looks something like select group_concat( myfield SEPARATOR '@||@')... The separator should be something that does not occur in your normal data stream. * Within the entity, define a field for "myfield" [the XML was stripped by the archive] * There are complications involved if NULL values are allowed for the field, in which case you would need to use COALESCE, maybe along with CAST * The transformer would look up "myfield", split along the separator, and populate the multi-valued field. This *is* a little complicated, so I would also like to hear about possible alternatives. Regards, Gora
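The splitting step Gora describes (the entity XML itself was lost by the archive) can be sketched as follows. The class and field names are modeled on the thread but the whole thing is an illustration, not the actual JobsNumericTransformer; a real DIH custom transformer exposes this logic through a transformRow(Map) method, is named in the entity's transformer="..." attribute, and ships in a .jar on Solr's classpath:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative DIH-style transformer: splits a GROUP_CONCAT'ed column back
// into multiple values for a multiValued Solr field.
public class MultiValueSplitter {
    static final String SEPARATOR = "@||@"; // must not occur in the normal data stream

    static List<String> split(String concatenated) {
        // Pattern.quote because the separator contains regex metacharacters.
        return Arrays.asList(concatenated.split(Pattern.quote(SEPARATOR)));
    }

    // DIH calls this (via reflection) once per row; replacing the String with
    // a List makes the target field multi-valued.
    public Object transformRow(Map<String, Object> row) {
        Object raw = row.get("myfield");
        if (raw != null) {
            row.put("myfield", split(raw.toString()));
        }
        return row;
    }
}
```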
Re: MySQL data import
Hi all, Any tips on this one? Thanks, Brian Lamb On Sun, Dec 11, 2011 at 3:54 PM, Brian Lamb wrote: > Hi all, > > I have a few questions about how the MySQL data import works. It seems it > creates a separate connection for each entity I create. Is there any way to > avoid this? > > By nature of my schema, I have several multivalued fields. Each one I > populate with a separate entity. Is there a better way to do it? For > example, could I pull in all the singular data in one sitting and then come > back in later and populate with the multivalued items. > > An alternate approach in some cases would be to do a GROUP_CONCAT and then > populate the multivalued column with some transformation. Is that possible? > > Lastly, is it possible to use copyField to copy three regular fields into > one multiValued field and have all the data show up? > > Thanks, > > Brian Lamb >
MySQL data import
Hi all, I have a few questions about how the MySQL data import works. It seems it creates a separate connection for each entity I create. Is there any way to avoid this? By nature of my schema, I have several multivalued fields. Each one I populate with a separate entity. Is there a better way to do it? For example, could I pull in all the singular data in one sitting and then come back in later and populate with the multivalued items. An alternate approach in some cases would be to do a GROUP_CONCAT and then populate the multivalued column with some transformation. Is that possible? Lastly, is it possible to use copyField to copy three regular fields into one multiValued field and have all the data show up? Thanks, Brian Lamb
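On the last question: yes, several copyField directives can share one multiValued destination, as Erick confirms in his reply. A schema.xml sketch with placeholder field and type names:

```xml
<!-- One multiValued destination field fed by three regular source fields. -->
<field name="all_text" type="text_general" indexed="true" stored="false"
       multiValued="true"/>
<copyField source="field_a" dest="all_text"/>
<copyField source="field_b" dest="all_text"/>
<copyField source="field_c" dest="all_text"/>
```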