Re: Mysql data import issue

2016-01-29 Thread vsriram30
Thanks, Gora, for your suggestions. Since my table contains a lot of fields, and
all the other fields have the same name mapping for Solr and MySQL, I thought I
could give a mapping for the one which is different and leave the remaining as is.
But is not selecting the id field in the returned query the only way to
achieve that?

Thanks,
Sriram



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Mysql-data-import-issue-tp4253998p4254210.html
Sent from the Solr - User mailing list archive at Nabble.com.


Mysql data import issue

2016-01-28 Thread vsriram30
Hi,
I am using Solr 4.6.1 and I am trying to import my data from MySQL to Solr.

In MySQL, I have a table with columns:
id, legacyid, otherfields...

In Solr I have columns: id, other fields. I want to map the legacyid field
in my MySQL table to Solr's id column and skip the "id" field of MySQL
while doing the import. Hence I have a mapping,


But still I get a one-to-one mapping of my MySQL id field to Solr's id field.
Can you please let me know how to prevent this from happening?

I even mapped the id field of MySQL to an empty Solr field.

But still I get the MySQL id field mapped to the Solr id field. Please let me
know how to prevent this from happening.

Thanks,
Sriram



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Mysql-data-import-issue-tp4253998.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Mysql data import issue

2016-01-28 Thread Gora Mohanty
On 29 January 2016 at 04:13, vsriram30  wrote:

> Hi,
> I am using Solr 4.6.1 and I am trying to import my data from mysql to solr.
>
> In mysql, I have a table with columns,
> id, legacyid, otherfields...
>
[...]

> But still I get mysql id field to solr id field mapping. Please let me know
> how to prevent this from happening.
>

How about if you do not select the mysql "id" field in the query attribute
for the entity?
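For instance, the aliasing can be done in SQL itself. A sketch only (the table
and extra field names here are invented, not taken from your schema):

```xml
<entity name="mytable"
        query="SELECT legacyid AS id, field1, field2 FROM mytable">
  <!-- legacyid is aliased to id in SQL, and the MySQL id column is
       never selected, so nothing but legacyid maps onto Solr's id -->
</entity>
```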

Regards,
Gora


MySQL Data import handler

2013-09-14 Thread Baskar Sikkayan
Hi,
  I am new to Solr and trying the MySQL data import handler.

I have 3 tables in mysql.

1) user
2) master_data
3) child_data

The child_data table has foreign key references to master_data and user.
In the child_data table, I have the following columns:

1) date
2) fee
3) location
4) type
5) text
6) user_id
7) master_data_id

In the user table, the display name field might get changed at any time.
In the master table, certain fields must be indexed in Solr.

So I am combining these 3 tables and trying to index them; this is for a search
page.

Data will get added to these tables at any time, and the fields
might get changed at any time.

I am not sure how to keep Solr updated whenever there is any change in
the tables.

Also, I am looking to delete the index in Solr based on the date field.

Please help me with this.


Thanks,
Baskar.S


Re: MySQL Data import handler

2013-09-14 Thread Gora Mohanty
On 14 September 2013 18:46, Baskar Sikkayan baskar@gmail.com wrote:
 Hi,
   I am new to Solr and trying the MySQL data import handler.

 I have 3 tables in mysql.
[...]

Your question is unclear, and you would probably benefit
by doing some basic homework on Solr and importing
data into it. Please start from the Solr Wiki:
http://wiki.apache.org/solr/

For indexing from a database, you can use the DataImportHandler:
http://wiki.apache.org/solr/DataImportHandler
http://wiki.apache.org/solr/DIHQuickStart
What you want should be pretty straightforward. After
setting up basic data import, please refer to delta-import
for adding new data to Solr:
http://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command
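A delta-import entity looks roughly like this. This is a sketch only: the
column list comes from your child_data table, but the last_modified column
is an assumption (delta-import needs some timestamp to detect changes by):

```xml
<entity name="child_data" pk="id"
        query="SELECT id, date, fee, location, type, text, user_id, master_data_id
               FROM child_data"
        deltaQuery="SELECT id FROM child_data
                    WHERE last_modified &gt; '${dataimporter.last_index_time}'"
        deltaImportQuery="SELECT id, date, fee, location, type, text, user_id, master_data_id
                          FROM child_data
                          WHERE id = '${dataimporter.delta.id}'">
</entity>
```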

Regards,
Gora


Re: MySQL Data import handler

2013-09-14 Thread Baskar Sikkayan
Hi Gora,
Thanks a lot for your reply.
My requirement is to combine 3 tables in MySQL for the search operation, and I
am planning to sync these 3 tables (not all the columns) to Apache Solr.
Whenever there is any change (adding a new row, deleting a row, modifying
the column data in any column of the 3 tables), the same has to be updated in
Solr. I guess for this requirement, instead of going with delta-import, the
Apache Solr Java client will be useful.

Could you please share your view?

Thanks,
Baskar.S


On Sat, Sep 14, 2013 at 7:25 PM, Gora Mohanty g...@mimirtech.com wrote:

 On 14 September 2013 18:46, Baskar Sikkayan baskar@gmail.com wrote:
  Hi,
    I am new to Solr and trying the MySQL data import handler.
 
  I have 3 tables in mysql.
 [...]

 Your question is unclear, and you would probably benefit
 by doing some basic homework on Solr and importing
 data into it. Please start from the Solr Wiki:
 http://wiki.apache.org/solr/

 For indexing from a database, you can use the DataImportHandler:
 http://wiki.apache.org/solr/DataImportHandler
 http://wiki.apache.org/solr/DIHQuickStart
 What you want should be pretty straightforward. After
 setting up basic data import, please refer to delta-import
 for adding new data to Solr:
 http://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command

 Regards,
 Gora



Re: MySQL Data import handler

2013-09-14 Thread Gora Mohanty
On 14 September 2013 20:07, Baskar Sikkayan baskar@gmail.com wrote:
 Hi Gora,
 Thanks a lot for your reply.
 My requirement is to combine 3 tables in mysql for search operation and
 planning to sync these 3 tables( not all the columns ) in Apache Solr.
 Whenever there is any change( adding a new row, deleting a row, modifying
 the column data( any column in the 3 tables ) ), the same has to updated in
 solr. Guess, for this requirement, instead of going with delta-import,
 Apache Solr Java client will be useful.
[...]

Yes, if you are comfortable with programming in Java,
the Solr client would be a good alternative, though the
DataImportHandler can also do what you want.

Regards,
Gora


Re: MySQL Data import handler

2013-09-14 Thread Baskar Sikkayan
Hi,
  If I am supposed to go with the Java client, should I still do any
configuration in solrconfig.xml or schema.xml?

Thanks,
Baskar.S


On Sat, Sep 14, 2013 at 8:46 PM, Gora Mohanty g...@mimirtech.com wrote:

 On 14 September 2013 20:07, Baskar Sikkayan baskar@gmail.com wrote:
  Hi Gora,
  Thanks a lot for your reply.
  My requirement is to combine 3 tables in mysql for search operation and
  planning to sync these 3 tables( not all the columns ) in Apache Solr.
  Whenever there is any change( adding a new row, deleting a row, modifying
  the column data( any column in the 3 tables ) ), the same has to updated
 in
  solr. Guess, for this requirement, instead of going with delta-import,
  Apache Solr Java client will be useful.
 [...]

 Yes, if you are comfortable with programming in Java,
 the Solr client would be a good alternative, though the
 DataImportHandler can also do what you want.

 Regards,
 Gora



Re: MySQL Data import handler

2013-09-14 Thread Amit Jha
Hi Baskar,

Just create a single schema.xml which should contain the required fields from the 3
tables.

Add a status column to the child table, i.e.
1 = add
2 = update
3 = delete
4 = indexed
etc.

Write a program using SolrJ which will read the status and act
accordingly.
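A minimal sketch of that status-driven loop, without the SolrJ dependency (the
SolrOps interface below is a hypothetical stand-in for the real SolrJ calls
such as server.add() and server.deleteById(); the status codes are the ones
listed above):

```java
import java.util.*;

// Sketch of the status-driven sync described above. SolrOps is a
// hypothetical stand-in for real SolrJ calls, not an actual SolrJ type.
public class StatusSync {
  static final int ADD = 1, UPDATE = 2, DELETE = 3, INDEXED = 4;

  interface SolrOps {
    void addOrUpdate(Map<String, Object> row); // would wrap server.add(doc)
    void delete(String id);                    // would wrap server.deleteById(id)
  }

  /** Processes one child_data row; returns the status to write back. */
  static int sync(Map<String, Object> row, SolrOps solr) {
    int status = (Integer) row.get("status");
    String id = String.valueOf(row.get("id"));
    switch (status) {
      case ADD:
      case UPDATE:
        // Adding a document with an existing uniqueKey replaces it in Solr,
        // so add and update are the same call.
        solr.addOrUpdate(row);
        return INDEXED;
      case DELETE:
        solr.delete(id);
        return INDEXED;
      default:
        return status; // already indexed: nothing to do
    }
  }

  public static void main(String[] args) {
    SolrOps ops = new SolrOps() {
      public void addOrUpdate(Map<String, Object> row) { System.out.println("add " + row.get("id")); }
      public void delete(String id) { System.out.println("delete " + id); }
    };
    Map<String, Object> row = new HashMap<>();
    row.put("id", 42);
    row.put("status", ADD);
    System.out.println("new status = " + sync(row, ops));
  }
}
```

The program would select rows whose status is not 4, call sync() for each, and
write the returned status back to the child table.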
 

Rgds
AJ

On 15-Sep-2013, at 5:46, Baskar Sikkayan baskar@gmail.com wrote:

 Hi,
  If I am supposed to go with the Java client, should I still do any
 configuration in solrconfig.xml or schema.xml?
 
 Thanks,
 Baskar.S
 
 
 On Sat, Sep 14, 2013 at 8:46 PM, Gora Mohanty g...@mimirtech.com wrote:
 
 On 14 September 2013 20:07, Baskar Sikkayan baskar@gmail.com wrote:
 Hi Gora,
Thanks a lot for your reply.
 My requirement is to combine 3 tables in mysql for search operation and
 planning to sync these 3 tables( not all the columns ) in Apache Solr.
 Whenever there is any change( adding a new row, deleting a row, modifying
 the column data( any column in the 3 tables ) ), the same has to updated
 in
 solr. Guess, for this requirement, instead of going with delta-import,
  Apache Solr Java client will be useful.
 [...]
 
 Yes, if you are comfortable with programming in Java,
 the Solr client would be a good alternative, though the
 DataImportHandler can also do what you want.
 
 Regards,
 Gora
 


Re: MySQL data import

2011-12-13 Thread Shawn Heisey

On 12/11/2011 1:54 PM, Brian Lamb wrote:

By nature of my schema, I have several multivalued fields. Each one I
populate with a separate entity. Is there a better way to do it? For
example, could I pull in all the singular data in one sitting and then come
back in later and populate with the multivalued items.

An alternate approach in some cases would be to do a GROUP_CONCAT and then
populate the multivalued column with some transformation. Is that possible?

Lastly, is it possible to use copyField to copy three regular fields into
one multiValued field and have all the data show up?


The best way to proceed may depend on whether you actually need the 
field to be multivalued (returning an array in search results), or if 
you simply need to be able to search on all the values.  For me, it's 
the latter - the field isn't stored.


I use the GROUP_CONCAT method (hidden in a database view, so Solr 
doesn't need to know about it) to put multiple values into a field, 
separated by semicolons.  I then use the following single-valued 
fieldType to split those up and make all the values searchable.  The 
tokenizer splits by semicolons followed by zero or more spaces, the 
pattern filter strips leading and trailing punctuation from each token.  
The ICU filter is basically a better implementation of the ascii folding 
filter and the lowercase filter, in a single pass.  The others are 
fairly self-explanatory:


<!-- lowercases, tokenize by semicolons -->
<fieldType name="lcsemi" class="solr.TextField" sortMissingLast="true"
    positionIncrementGap="0" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="; *"/>
    <filter class="solr.PatternReplaceFilterFactory"
      pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$"
      replacement="$2"
      allowempty="false"
    />
    <filter class="solr.ICUFoldingFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>
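To see what that chain does to a GROUP_CONCAT value, here is a rough plain-Java
approximation (an illustration only, not Solr code; the ICU folding step is
simplified to plain lowercasing):

```java
import java.util.*;

// Plain-Java approximation of the lcsemi analysis chain above:
// tokenize on "; *", strip leading/trailing punctuation, fold case,
// trim, and remove duplicates.
public class LcsemiDemo {
  public static List<String> analyze(String value) {
    Set<String> tokens = new LinkedHashSet<>(); // preserves order, drops duplicates
    for (String tok : value.split("; *")) {     // PatternTokenizerFactory pattern="; *"
      // PatternReplaceFilterFactory: keep only the middle group
      tok = tok.replaceAll("^\\p{Punct}*(.*?)\\p{Punct}*$", "$1");
      tok = tok.toLowerCase(Locale.ROOT).trim(); // ICU folding (simplified) + trim
      if (!tok.isEmpty()) tokens.add(tok);
    }
    return new ArrayList<>(tokens);
  }

  public static void main(String[] args) {
    System.out.println(analyze("Foo; bar!; foo; (Baz)"));
  }
}
```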

If you actually do need the field to be multivalued, then you'll need to
do a DataImportHandler transformation, as mentioned by Gora, who also replied.


Thanks,
Shawn



Re: MySQL data import

2011-12-12 Thread Brian Lamb
Hi all,

Any tips on this one?

Thanks,

Brian Lamb

On Sun, Dec 11, 2011 at 3:54 PM, Brian Lamb
brian.l...@journalexperts.comwrote:

 Hi all,

 I have a few questions about how the MySQL data import works. It seems it
 creates a separate connection for each entity I create. Is there any way to
 avoid this?

 By nature of my schema, I have several multivalued fields. Each one I
 populate with a separate entity. Is there a better way to do it? For
 example, could I pull in all the singular data in one sitting and then come
 back in later and populate with the multivalued items.

 An alternate approach in some cases would be to do a GROUP_CONCAT and then
 populate the multivalued column with some transformation. Is that possible?

 Lastly, is it possible to use copyField to copy three regular fields into
 one multiValued field and have all the data show up?

 Thanks,

 Brian Lamb



Re: MySQL data import

2011-12-12 Thread Gora Mohanty
On Mon, Dec 12, 2011 at 2:24 AM, Brian Lamb
brian.l...@journalexperts.com wrote:
 Hi all,

 I have a few questions about how the MySQL data import works. It seems it
 creates a separate connection for each entity I create. Is there any way to
 avoid this?

Not sure, but I do not think that it is possible. However, from your description
below, I think that you are unnecessarily multiplying entities.

 By nature of my schema, I have several multivalued fields. Each one I
 populate with a separate entity. Is there a better way to do it? For
 example, could I pull in all the singular data in one sitting and then come
 back in later and populate with the multivalued items.

Not quite sure as to what you mean. Would it be possible for you
to post your schema.xml, and the DIH configuration file? Preferably,
put these on pastebin.com, and send us links. Also, you should
obfuscate details like access passwords.

 An alternate approach in some cases would be to do a GROUP_CONCAT and then
 populate the multivalued column with some transformation. Is that possible?
[...]

This is how we have been handling it. A complete description would
be long, but here is the gist of it:
* A transformer will be needed. In this case, we found it easiest
  to use a Java-based transformer. Thus, your entity should include
  something like
    <entity name="myname" dataSource="mysource"
        transformer="com.mycompany.search.solr.handler.JobsNumericTransformer" ...>
      ...
    </entity>
  Here, the class name to be used for the transformer attribute follows
  the usual Java rules, and the .jar needs to be made available to Solr.
* The SELECT statement for the entity looks something like
    SELECT GROUP_CONCAT(myfield SEPARATOR '@||@') ...
  The separator should be something that does not occur in your
  normal data stream.
* Within the entity, define
    <field column="myfield"/>
* There are complications involved if NULL values are allowed
  for the field, in which case you would need to use COALESCE,
  maybe along with CAST.
* The transformer would look up myfield, split along the separator,
  and populate the multi-valued field.
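The splitting step can be sketched like so. This is a simplified illustration:
a real DIH transformer's transformRow method also receives a Context argument,
and the myfield name and '@||@' separator are just the placeholders from the
example above:

```java
import java.util.*;
import java.util.regex.Pattern;

// Sketch of a DIH-style transformer that splits a GROUP_CONCAT result
// into the values of a multi-valued field.
public class MultiValueTransformer {
  private static final String SEP = "@||@";

  public Map<String, Object> transformRow(Map<String, Object> row) {
    Object v = row.get("myfield");
    if (v != null) {
      // Split on the literal separator; the resulting List becomes
      // the multiple values of the Solr field.
      List<String> values = Arrays.asList(v.toString().split(Pattern.quote(SEP)));
      row.put("myfield", values);
    }
    return row;
  }

  public static void main(String[] args) {
    Map<String, Object> row = new HashMap<>();
    row.put("myfield", "red@||@green@||@blue");
    System.out.println(new MultiValueTransformer().transformRow(row));
  }
}
```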

This *is* a little complicated, so I would also like to hear about
possible alternatives.

Regards,
Gora


Re: MySQL data import

2011-12-12 Thread Erick Erickson
You might want to consider just doing the whole
thing in SolrJ with a JDBC connection. When things
get complex, it's sometimes more straightforward.

Best
Erick...

P.S. Yes, it's pretty standard to have a single
field be the destination for several copyField
directives.
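For the record, such a setup looks like the following in schema.xml (the field
names here are invented for illustration):

```xml
<field name="all_text" type="text_general" indexed="true" stored="false"
       multiValued="true"/>
<copyField source="title"    dest="all_text"/>
<copyField source="keywords" dest="all_text"/>
<copyField source="body"     dest="all_text"/>
```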

On Mon, Dec 12, 2011 at 12:48 PM, Gora Mohanty g...@mimirtech.com wrote:
 On Mon, Dec 12, 2011 at 2:24 AM, Brian Lamb
 brian.l...@journalexperts.com wrote:
 Hi all,

 I have a few questions about how the MySQL data import works. It seems it
 creates a separate connection for each entity I create. Is there any way to
 avoid this?

 Not sure, but I do not think that it is possible. However, from your 
 description
 below, I think that you are unnecessarily multiplying entities.

 By nature of my schema, I have several multivalued fields. Each one I
 populate with a separate entity. Is there a better way to do it? For
 example, could I pull in all the singular data in one sitting and then come
 back in later and populate with the multivalued items.

 Not quite sure as to what you mean. Would it be possible for you
 to post your schema.xml, and the DIH configuration file? Preferably,
 put these on pastebin.com, and send us links. Also, you should
 obfuscate details like access passwords.

 An alternate approach in some cases would be to do a GROUP_CONCAT and then
 populate the multivalued column with some transformation. Is that possible?
 [...]

 This is how we have been handling it. A complete description would
 be long, but here is the gist of it:
 * A transformer will be needed. In this case, we found it easiest
   to use a Java-based transformer. Thus, your entity should include
   something like
     <entity name="myname" dataSource="mysource"
         transformer="com.mycompany.search.solr.handler.JobsNumericTransformer" ...>
       ...
     </entity>
   Here, the class name to be used for the transformer attribute follows
   the usual Java rules, and the .jar needs to be made available to Solr.
 * The SELECT statement for the entity looks something like
     SELECT GROUP_CONCAT(myfield SEPARATOR '@||@') ...
   The separator should be something that does not occur in your
   normal data stream.
 * Within the entity, define
     <field column="myfield"/>
 * There are complications involved if NULL values are allowed
   for the field, in which case you would need to use COALESCE,
   maybe along with CAST.
 * The transformer would look up myfield, split along the separator,
   and populate the multi-valued field.

 This *is* a little complicated, so I would also like to hear about
 possible alternatives.

 Regards,
 Gora


Re: MySQL data import

2011-12-12 Thread Brian Lamb
Thanks all. Erick, is there documentation on doing things with SolrJ and a
JDBC connection?

On Mon, Dec 12, 2011 at 1:34 PM, Erick Erickson erickerick...@gmail.comwrote:

 You might want to consider just doing the whole
 thing in SolrJ with a JDBC connection. When things
 get complex, it's sometimes more straightforward.

 Best
 Erick...

 P.S. Yes, it's pretty standard to have a single
 field be the destination for several copyField
 directives.

 On Mon, Dec 12, 2011 at 12:48 PM, Gora Mohanty g...@mimirtech.com wrote:
  On Mon, Dec 12, 2011 at 2:24 AM, Brian Lamb
  brian.l...@journalexperts.com wrote:
  Hi all,
 
  I have a few questions about how the MySQL data import works. It seems
 it
  creates a separate connection for each entity I create. Is there any
 way to
  avoid this?
 
  Not sure, but I do not think that it is possible. However, from your
 description
  below, I think that you are unnecessarily multiplying entities.
 
  By nature of my schema, I have several multivalued fields. Each one I
  populate with a separate entity. Is there a better way to do it? For
  example, could I pull in all the singular data in one sitting and then
 come
  back in later and populate with the multivalued items.
 
  Not quite sure as to what you mean. Would it be possible for you
  to post your schema.xml, and the DIH configuration file? Preferably,
  put these on pastebin.com, and send us links. Also, you should
  obfuscate details like access passwords.
 
  An alternate approach in some cases would be to do a GROUP_CONCAT and
 then
  populate the multivalued column with some transformation. Is that
 possible?
  [...]
 
  This is how we have been handling it. A complete description would
  be long, but here is the gist of it:
  * A transformer will be needed. In this case, we found it easiest
    to use a Java-based transformer. Thus, your entity should include
    something like
      <entity name="myname" dataSource="mysource"
          transformer="com.mycompany.search.solr.handler.JobsNumericTransformer" ...>
        ...
      </entity>
    Here, the class name to be used for the transformer attribute follows
    the usual Java rules, and the .jar needs to be made available to Solr.
  * The SELECT statement for the entity looks something like
      SELECT GROUP_CONCAT(myfield SEPARATOR '@||@') ...
    The separator should be something that does not occur in your
    normal data stream.
  * Within the entity, define
      <field column="myfield"/>
  * There are complications involved if NULL values are allowed
    for the field, in which case you would need to use COALESCE,
    maybe along with CAST.
  * The transformer would look up myfield, split along the separator,
    and populate the multi-valued field.
 
  This *is* a little complicated, so I would also like to hear about
  possible alternatives.
 
  Regards,
  Gora



Re: MySQL data import

2011-12-12 Thread Erick Erickson
Here's a quick demo I wrote at one point. I haven't run it in a while,
but you should be able to get the idea.


package jdbc;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;
import org.apache.solr.client.solrj.impl.XMLResponseParser;
import org.apache.solr.common.SolrInputDocument;

import java.io.IOException;
import java.sql.*;
import java.util.ArrayList;
import java.util.Collection;

public class Indexer {
  public static void main(String[] args) {
    startIndex("http://localhost:8983/solr");
  }

  private static void startIndex(String url) {
    Connection con = DataSource.getConnection();
    try {
      long start = System.currentTimeMillis();
      // Create a multi-threaded communications channel to the Solr
      // server. Full interface (3.3) at:
      // http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/impl/StreamingUpdateSolrServer.html
      StreamingUpdateSolrServer server = new StreamingUpdateSolrServer(url, 10, 4);

      // You may want to set these timeouts higher, Solr occasionally
      // will have long pauses while segments merge.
      server.setSoTimeout(1000);  // socket read timeout
      server.setConnectionTimeout(100);
      //server.setDefaultMaxConnectionsPerHost(100);
      //server.setMaxTotalConnections(100);
      //server.setFollowRedirects(false);  // defaults to false
      // allowCompression defaults to false.
      // Server side must support gzip or deflate for this to have any effect.
      //server.setAllowCompression(true);
      server.setMaxRetries(1); // defaults to 0. > 1 not recommended.
      server.setParser(new XMLResponseParser()); // binary parser is used by default

      doDocuments(server, con);
      server.commit(); // Only needs to be done at the end, autocommit
                       // or commitWithin should do the rest.
      long endTime = System.currentTimeMillis();
      System.out.println("Total Time Taken-" + (endTime - start) + " mils");

    } catch (Exception e) {
      e.printStackTrace();
      String msg = e.getMessage();
      System.out.println(msg);
    }
  }

  private static void doDocuments(StreamingUpdateSolrServer server,
      Connection con) throws SQLException, IOException, SolrServerException {

    Statement st = con.createStatement();
    ResultSet rs = st.executeQuery("select id,title,text from test");

    // SolrInputDocument interface (3.3) at
    // http://lucene.apache.org/solr/api/org/apache/solr/common/SolrInputDocument.html
    Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
    int total = 0;
    int counter = 0;

    while (rs.next()) {
      SolrInputDocument doc = new SolrInputDocument(); // DO NOT move this
                                                       // outside the while loop
                                                       // or be sure to call doc.clear()
      String id = rs.getString("id");
      String title = rs.getString("title");
      String text = rs.getString("text");

      doc.addField("id", id);
      doc.addField("title", title);
      doc.addField("text", text);

      docs.add(doc);
      ++counter;
      ++total;
      if (counter > 1000) { // Completely arbitrary, just batch up
                            // more than one document for throughput!
        server.add(docs);
        docs.clear();
        counter = 0;
      }
    }
    if (!docs.isEmpty()) { // don't drop the final partial batch
      server.add(docs);
    }
    System.out.println("Total " + total + " Docs added successfully");
  }
}

// Trivial class showing connecting to a MySQL database server via JDBC...
class DataSource {
  public static Connection getConnection() {
    Connection conn = null;
    try {
      Class.forName("com.mysql.jdbc.Driver").newInstance();
      System.out.println("Driver Loaded..");
      conn = DriverManager.getConnection("jdbc:mysql://172.16.0.169:3306/test?"
          + "user=testuser&password=test123");
      System.out.println("Connection build..");
    } catch (Exception ex) {
      System.out.println(ex);
    }
    return conn;
  }

  public static void closeConnection(Connection con) {
    try {
      if (con != null)
        con.close();
    } catch (SQLException e) {
      e.printStackTrace();
    }
  }
}

On Mon, Dec 12, 2011 at 2:57 PM, Brian Lamb
brian.l...@journalexperts.com wrote:
 Thanks all. Erick, is there documentation on doing things with SolrJ and a
 JDBC connection?

 On Mon, Dec 12, 2011 at 1:34 PM, Erick Erickson 
 erickerick...@gmail.comwrote:

 You might want to consider just doing the whole
 thing in SolrJ with a JDBC connection. When things
 get complex, it's sometimes more straightforward.

 Best
 Erick...

 P.S. Yes, it's pretty standard to have a single
 field be the destination for several copyField
 directives.

 On Mon, Dec 12, 2011 at 12:48 PM, Gora Mohanty g...@mimirtech.com wrote:
  On Mon, Dec 12, 2011 at 2:24 AM, Brian Lamb
  brian.l...@journalexperts.com wrote:
  Hi all,
 
  I have a few questions about how the MySQL data import works. It seems
 it
  creates a separate connection for each entity I create. Is there any
 way to
  avoid this?
 
  Not sure, but I do not think that it is possible. However, from your

MySQL data import

2011-12-11 Thread Brian Lamb
Hi all,

I have a few questions about how the MySQL data import works. It seems it
creates a separate connection for each entity I create. Is there any way to
avoid this?

By nature of my schema, I have several multivalued fields. Each one I
populate with a separate entity. Is there a better way to do it? For
example, could I pull in all the singular data in one sitting and then come
back in later and populate with the multivalued items.

An alternate approach in some cases would be to do a GROUP_CONCAT and then
populate the multivalued column with some transformation. Is that possible?

Lastly, is it possible to use copyField to copy three regular fields into
one multiValued field and have all the data show up?

Thanks,

Brian Lamb