Re: Mysql data import issue

2016-01-29 Thread vsriram30
Thanks, Gora, for your suggestions. Since my table contains a lot of fields, and
all the other fields have the same name in Solr and MySQL, I thought I could
give a mapping only for the one that is different and leave the rest as is.
But is not selecting the id field in the returned query the only way to
achieve that?
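
For what it's worth, the renaming can also be pushed into the SQL itself. A
sketch, with illustrative table and field names:

select legacyid as id, field1, field2 from mytable

With the alias, the returned column is already named id, so it maps implicitly
like the other fields, and the MySQL id column is never returned.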

Thanks,
Sriram





Re: Mysql data import issue

2016-01-28 Thread Gora Mohanty
On 29 January 2016 at 04:13, vsriram30  wrote:

> Hi,
> I am using Solr 4.6.1 and I am trying to import my data from mysql to solr.
>
> In mysql, I have a table with columns,
> id, legacyid, otherfields...
>
[...]

> But I still get the MySQL id field mapped to the Solr id field. Please let me
> know how to prevent this from happening.
>

How about if you do not select the mysql "id" field in the query attribute
for the entity?
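
For instance, a sketch of such an entity, with illustrative table and field
names; the MySQL id column is simply left out of the SELECT:

<entity name="mytable"
        query="select legacyid, field1, field2 from mytable">
  <field column="legacyid" name="id"/>
</entity>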

Regards,
Gora


Mysql data import issue

2016-01-28 Thread vsriram30
Hi,
I am using Solr 4.6.1 and I am trying to import my data from mysql to solr.

In mysql, I have a table with columns,
id, legacyid, otherfields...

In Solr I have fields: id, other fields. I want to map the legacyid field
in my MySQL table to Solr's id field and skip the "id" field of MySQL
during import. Hence I have a mapping:
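(presumably a DIH field element along these lines:)

<field column="legacyid" name="id"/>  <!-- illustrative reconstruction -->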


But I still get a one-to-one mapping of my MySQL id field to Solr's id field.
Can you please let me know how to prevent this from happening?

I even mapped the id field of MySQL to an empty Solr field:
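(presumably something like:)

<field column="id" name=""/>  <!-- illustrative reconstruction -->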

But I still get the MySQL id field mapped to the Solr id field. Please let me
know how to prevent this from happening.

Thanks,
Sriram





Re: MySQL Data import handler

2013-09-14 Thread Amit Jha
Hi Baskar,

Just create a single schema.xml which should contain the required fields from
the 3 tables.

Add a status column to the child table, i.e.
1 = add
2 = update
3 = delete
4 = indexed
etc.

Write a program using SolrJ which will read the status and act
accordingly.
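
A minimal SolrJ sketch of such a program, assuming Solr 4.x; the Solr URL, the
JDBC settings, and some_field are illustrative:

import java.sql.*;

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class StatusSync {
  public static void main(String[] args) throws Exception {
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
    Connection con = DriverManager.getConnection(
        "jdbc:mysql://localhost:3306/mydb?user=testuser&password=test123");
    Statement st = con.createStatement();
    // Pick up rows that are pending: 1 = add, 2 = update, 3 = delete
    ResultSet rs = st.executeQuery(
        "select id, status, some_field from child_data where status in (1,2,3)");
    while (rs.next()) {
      String id = rs.getString("id");
      if (rs.getInt("status") == 3) {
        solr.deleteById(id);               // 3 = delete
      } else {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", id);
        doc.addField("some_field", rs.getString("some_field"));
        solr.add(doc);                     // add and update are the same call
      }
      // Mark the row as indexed (4) so it is not picked up again
      PreparedStatement upd = con.prepareStatement(
          "update child_data set status = 4 where id = ?");
      upd.setString(1, id);
      upd.executeUpdate();
      upd.close();
    }
    solr.commit();
    con.close();
    solr.shutdown();
  }
}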
 

Rgds
AJ

On 15-Sep-2013, at 5:46, Baskar Sikkayan  wrote:

> Hi,
>  If I am supposed to go with the Java client, should I still do any
> configuration in solrconfig.xml or schema.xml?
> 
> Thanks,
> Baskar.S
> 
> 
> On Sat, Sep 14, 2013 at 8:46 PM, Gora Mohanty  wrote:
> 
>> On 14 September 2013 20:07, Baskar Sikkayan  wrote:
>>> Hi Gora,
>>> Thanks a lot for your reply.
>>> My requirement is to combine 3 tables in MySQL for search operation and
>>> planning to sync these 3 tables (not all the columns) in Apache Solr.
>>> Whenever there is any change (adding a new row, deleting a row, modifying
>>> the column data (any column in the 3 tables)), the same has to be updated
>>> in Solr. I guess, for this requirement, instead of going with delta-import,
>>> the Apache Solr Java client will be useful.
>> [...]
>> 
>> Yes, if you are comfortable with programming in Java,
>> the Solr client would be a good alternative, though the
>> DataImportHandler can also do what you want.
>> 
>> Regards,
>> Gora
>> 


Re: MySQL Data import handler

2013-09-14 Thread Baskar Sikkayan
Hi,
  If I am supposed to go with the Java client, should I still do any
configuration in solrconfig.xml or schema.xml?

Thanks,
Baskar.S


On Sat, Sep 14, 2013 at 8:46 PM, Gora Mohanty  wrote:

> On 14 September 2013 20:07, Baskar Sikkayan  wrote:
> > Hi Gora,
> > Thanks a lot for your reply.
> > My requirement is to combine 3 tables in MySQL for search operation and
> > planning to sync these 3 tables (not all the columns) in Apache Solr.
> > Whenever there is any change (adding a new row, deleting a row, modifying
> > the column data (any column in the 3 tables)), the same has to be updated
> > in Solr. I guess, for this requirement, instead of going with delta-import,
> > the Apache Solr Java client will be useful.
> [...]
>
> Yes, if you are comfortable with programming in Java,
> the Solr client would be a good alternative, though the
> DataImportHandler can also do what you want.
>
> Regards,
> Gora
>


Re: MySQL Data import handler

2013-09-14 Thread Gora Mohanty
On 14 September 2013 20:07, Baskar Sikkayan  wrote:
> Hi Gora,
> Thanks a lot for your reply.
> My requirement is to combine 3 tables in MySQL for search operation and
> planning to sync these 3 tables (not all the columns) in Apache Solr.
> Whenever there is any change (adding a new row, deleting a row, modifying
> the column data (any column in the 3 tables)), the same has to be updated
> in Solr. I guess, for this requirement, instead of going with delta-import,
> the Apache Solr Java client will be useful.
[...]

Yes, if you are comfortable with programming in Java,
the Solr client would be a good alternative, though the
DataImportHandler can also do what you want.

Regards,
Gora


Re: MySQL Data import handler

2013-09-14 Thread Baskar Sikkayan
Hi Gora,
Thanks a lot for your reply.
My requirement is to combine 3 tables in MySQL for search operation and
planning to sync these 3 tables (not all the columns) in Apache Solr.
Whenever there is any change (adding a new row, deleting a row, modifying
the column data (any column in the 3 tables)), the same has to be updated
in Solr. I guess, for this requirement, instead of going with delta-import,
the Apache Solr Java client will be useful.

Could you please share your view?

Thanks,
Baskar.S


On Sat, Sep 14, 2013 at 7:25 PM, Gora Mohanty  wrote:

> On 14 September 2013 18:46, Baskar Sikkayan  wrote:
> > Hi,
> >   I am new to Solr and am trying the MySQL data import handler.
> >
> > I have 3 tables in mysql.
> [...]
>
> Your question is unclear, and you would probably benefit
> by doing some basic homework on Solr and importing
> data into it. Please start from the Solr Wiki:
> http://wiki.apache.org/solr/
>
> For indexing from a database, you can use the DataImportHandler:
> http://wiki.apache.org/solr/DataImportHandler
> http://wiki.apache.org/solr/DIHQuickStart
> What you want should be pretty straightforward. After
> setting up basic data import, please refer to delta-import
> for adding new data to Solr:
> http://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command
>
> Regards,
> Gora
>


Re: MySQL Data import handler

2013-09-14 Thread Gora Mohanty
On 14 September 2013 18:46, Baskar Sikkayan  wrote:
> Hi,
>   I am new to Solr and am trying the MySQL data import handler.
>
> I have 3 tables in mysql.
[...]

Your question is unclear, and you would probably benefit
by doing some basic homework on Solr and importing
data into it. Please start from the Solr Wiki:
http://wiki.apache.org/solr/

For indexing from a database, you can use the DataImportHandler:
http://wiki.apache.org/solr/DataImportHandler
http://wiki.apache.org/solr/DIHQuickStart
What you want should be pretty straightforward. After
setting up basic data import, please refer to delta-import
for adding new data to Solr:
http://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command
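
For reference, a minimal delta-import sketch in DIH data-config.xml syntax; the
last_modified column and the field list are illustrative:

<entity name="child" pk="id"
        query="select id, fee, location from child_data"
        deltaQuery="select id from child_data
                    where last_modified &gt; '${dataimporter.last_index_time}'"
        deltaImportQuery="select id, fee, location from child_data
                          where id = '${dataimporter.delta.id}'">
</entity>

(Note that &gt; is needed because the query lives inside an XML attribute.)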

Regards,
Gora


MySQL Data import handler

2013-09-14 Thread Baskar Sikkayan
Hi,
  I am new to Solr and am trying the MySQL data import handler.

I have 3 tables in mysql.

1) user
2) master_data
3) child_data

The child_data table has foreign key references to master_data and user.
In the child_data table, I have the following columns.

1) date
2) fee
3) location
4) type
5) text
6) user_id
7) master_data_id

In the user table, the display name field might get changed at any time.
In the master table, certain fields must be indexed in Solr.

So I am combining these 3 tables and trying to index them. This is for a search
page ...

The data will get added to these tables at any time, and the fields
might get changed at any time ...

I am not sure how to keep Solr updated whenever there is any change in
the tables ...

Also, I am looking to delete documents from the Solr index based on the date field ...

Please help me with this ...


Thanks,
Baskar.S


Re: MySQL data import

2011-12-13 Thread Shawn Heisey

On 12/11/2011 1:54 PM, Brian Lamb wrote:

By nature of my schema, I have several multivalued fields. Each one I
populate with a separate entity. Is there a better way to do it? For
example, could I pull in all the singular data in one sitting and then come
back in later and populate with the multivalued items.

An alternate approach in some cases would be to do a GROUP_CONCAT and then
populate the multivalued column with some transformation. Is that possible?

Lastly, is it possible to use copyField to copy three regular fields into
one multiValued field and have all the data show up?


The best way to proceed may depend on whether you actually need the 
field to be multivalued (returning an array in search results), or if 
you simply need to be able to search on all the values.  For me, it's 
the latter - the field isn't stored.


I use the GROUP_CONCAT method (hidden in a database view, so Solr 
doesn't need to know about it) to put multiple values into a field, 
separated by semicolons.  I then use the following single-valued 
fieldType to split those up and make all the values searchable.  The 
tokenizer splits by semicolons followed by zero or more spaces, the 
pattern filter strips leading and trailing punctuation from each token.  
The ICU filter is basically a better implementation of the ascii folding 
filter and the lowercase filter, in a single pass.  The others are 
fairly self-explanatory:



<!-- Reconstructed sketch: the fieldType name and exact patterns are illustrative. -->
<fieldType name="genText" class="solr.TextField"
           positionIncrementGap="0" omitNorms="true">
  <analyzer>
    <!-- split on a semicolon followed by zero or more spaces -->
    <tokenizer class="solr.PatternTokenizerFactory" pattern="; *"/>
    <!-- strip leading and trailing punctuation from each token -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^\p{Punct}+|\p{Punct}+$" replacement="" replace="all"/>
    <!-- ICU folding: ASCII folding plus lowercasing in a single pass -->
    <filter class="solr.ICUFoldingFilterFactory"/>
  </analyzer>
</fieldType>

If you actually do need the field to be multivalued, then you'll need to
do a DataImportHandler transformation, as mentioned by Gora, who also replied.
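
For the database side, a minimal sketch of the kind of view described above,
with illustrative table and column names:

CREATE VIEW item_flat AS
SELECT i.id,
       i.title,
       GROUP_CONCAT(t.tag SEPARATOR '; ') AS tags
  FROM item i
  LEFT JOIN tag t ON t.item_id = i.id
 GROUP BY i.id, i.title;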


Thanks,
Shawn



Re: MySQL data import

2011-12-12 Thread Erick Erickson
Here's a quick demo I wrote at one point. I haven't run it in a while,
but you should be able to get the idea.


package jdbc;


import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;
import org.apache.solr.client.solrj.impl.XMLResponseParser;
import org.apache.solr.common.SolrInputDocument;

import java.io.IOException;
import java.sql.*;
import java.util.ArrayList;
import java.util.Collection;


public class Indexer {
  public static void main(String[] args) {
startIndex("http://localhost:8983/solr";);
  }

  private static void startIndex(String url) {
Connection con = DataSource.getConnection();
try {

  long start = System.currentTimeMillis();
      // Create a multi-threaded communications channel to the Solr server.
      // Full interface (3.3) at:
      // http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/impl/StreamingUpdateSolrServer.html
  StreamingUpdateSolrServer server = new
StreamingUpdateSolrServer(url, 10, 4);

      // You may want to set these timeouts higher, Solr occasionally
      // will have long pauses while segments merge.
  server.setSoTimeout(1000);  // socket read timeout
  server.setConnectionTimeout(100);
  //server.setDefaultMaxConnectionsPerHost(100);
  //server.setMaxTotalConnections(100);
  //server.setFollowRedirects(false);  // defaults to false
  // allowCompression defaults to false.
  // Server side must support gzip or deflate for this to have any effect.
  //server.setAllowCompression(true);
  server.setMaxRetries(1); // defaults to 0.  > 1 not recommended.
      server.setParser(new XMLResponseParser()); // binary parser is used by default

  doDocuments(server, con);
      server.commit(); // Only needs to be done at the end, autocommit
      // or commitWithin should do the rest.
  long endTime = System.currentTimeMillis();
  System.out.println("Total Time Taken->" + (endTime - start) + " mils");

} catch (Exception e) {
  e.printStackTrace();
  String msg = e.getMessage();
  System.out.println(msg);
}
  }

  private static void doDocuments(StreamingUpdateSolrServer server,
Connection con) throws SQLException, IOException, SolrServerException
{

Statement st = con.createStatement();
ResultSet rs = st.executeQuery("select id,title,text from test");

    // SolrInputDocument interface (3.3) at
    // http://lucene.apache.org/solr/api/org/apache/solr/common/SolrInputDocument.html
    Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
int total = 0;
int counter = 0;

while (rs.next()) {
      SolrInputDocument doc = new SolrInputDocument(); // DO NOT move this
      // outside the while loop, or be sure to call doc.clear()

  String id = rs.getString("id");
  String title = rs.getString("title");
  String text = rs.getString("text");

  doc.addField("id", id);
  doc.addField("title", title);
  doc.addField("text", text);

  docs.add(doc);
  ++counter;
  ++total;
      if (counter > 1000) { // completely arbitrary, just batch up
                            // more than one document for throughput!
server.add(docs);
docs.clear();
counter = 0;
  }
    }
    if (!docs.isEmpty()) { // add any leftover partial batch
      server.add(docs);
    }
    System.out.println("Total " + total + " Docs added successfully");

  }
}

// Trivial class showing how to connect to a MySQL database server via JDBC...
class DataSource {
  public static Connection getConnection() {
Connection conn = null;
try {

  Class.forName("com.mysql.jdbc.Driver").newInstance();
  System.out.println("Driver Loaded..");
  conn = DriverManager.getConnection("jdbc:mysql://172.16.0.169:3306/test?"
+ "user=testuser&password=test123");
  System.out.println("Connection build..");
} catch (Exception ex) {
  System.out.println(ex);
}
return conn;
  }

  public static void closeConnection(Connection con) {
try {
  if (con != null)
con.close();
} catch (SQLException e) {
  e.printStackTrace();
}
  }
}

On Mon, Dec 12, 2011 at 2:57 PM, Brian Lamb
 wrote:
> Thanks all. Erick, is there documentation on doing things with SolrJ and a
> JDBC connection?
>
> On Mon, Dec 12, 2011 at 1:34 PM, Erick Erickson 
> wrote:
>
>> You might want to consider just doing the whole
>> thing in SolrJ with a JDBC connection. When things
>> get complex, it's sometimes more straightforward.
>>
>> Best
>> Erick...
>>
>> P.S. Yes, it's pretty standard to have a single
>> field be the destination for several copyField
>> directives.
>>
>> On Mon, Dec 12, 2011 at 12:48 PM, Gora Mohanty  wrote:
>> > On Mon, Dec 12, 2011 at 2:24 AM, Brian Lamb
>> >  wrote:

Re: MySQL data import

2011-12-12 Thread Brian Lamb
Thanks all. Erick, is there documentation on doing things with SolrJ and a
JDBC connection?

On Mon, Dec 12, 2011 at 1:34 PM, Erick Erickson wrote:

> You might want to consider just doing the whole
> thing in SolrJ with a JDBC connection. When things
> get complex, it's sometimes more straightforward.
>
> Best
> Erick...
>
> P.S. Yes, it's pretty standard to have a single
> field be the destination for several copyField
> directives.
>
> On Mon, Dec 12, 2011 at 12:48 PM, Gora Mohanty  wrote:
> > On Mon, Dec 12, 2011 at 2:24 AM, Brian Lamb
> >  wrote:
> >> Hi all,
> >>
> >> I have a few questions about how the MySQL data import works. It seems
> it
> >> creates a separate connection for each entity I create. Is there any
> way to
> >> avoid this?
> >
> > Not sure, but I do not think that it is possible. However, from your
> description
> > below, I think that you are unnecessarily multiplying entities.
> >
> >> By nature of my schema, I have several multivalued fields. Each one I
> >> populate with a separate entity. Is there a better way to do it? For
> >> example, could I pull in all the singular data in one sitting and then
> come
> >> back in later and populate with the multivalued items.
> >
> > Not quite sure as to what you mean. Would it be possible for you
> > to post your schema.xml, and the DIH configuration file? Preferably,
> > put these on pastebin.com, and send us links. Also, you should
> > obfuscate details like access passwords.
> >
> >> An alternate approach in some cases would be to do a GROUP_CONCAT and
> then
> >> populate the multivalued column with some transformation. Is that
> possible?
> > [...]
> >
> > This is how we have been handling it. A complete description would
> > be long, but here is the gist of it:
> > * A transformer will be needed. In this case, we found it easiest
> >  to use a Java-based transformer. Thus, your entity should include
> >  something like
> >   <entity transformer="com.mycompany.search.solr.handler.JobsNumericTransformer" ...>
> >  ...
> >  </entity>
> >  Here, the class name to be used for the transformer attribute follows
> >  the usual Java rules, and the .jar needs to be made available to Solr.
> > * The SELECT statement for the entity looks something like
> >  select group_concat( myfield SEPARATOR '@||@')...
> >  The separator should be something that does not occur in your
> >  normal data stream.
> > * Within the entity, define
> >   <field column="myfield" ... />
> > * There are complications involved if NULL values are allowed
> >   for the field, in which case you would need to use COALESCE,
> >   maybe along with CAST
> > * The transformer would look up "myfield", split along the separator,
> >   and populate the multi-valued field.
> >
> > This *is* a little complicated, so I would also like to hear about
> > possible alternatives.
> >
> > Regards,
> > Gora
>


Re: MySQL data import

2011-12-12 Thread Erick Erickson
You might want to consider just doing the whole
thing in SolrJ with a JDBC connection. When things
get complex, it's sometimes more straightforward.

Best
Erick...

P.S. Yes, it's pretty standard to have a single
field be the destination for several copyField
directives.
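
For reference, a sketch of that schema.xml pattern, with illustrative field
names; the destination must be multiValued to take several sources:

<field name="catchall" type="text_general" indexed="true" stored="false"
       multiValued="true"/>
<copyField source="title"  dest="catchall"/>
<copyField source="author" dest="catchall"/>
<copyField source="body"   dest="catchall"/>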

On Mon, Dec 12, 2011 at 12:48 PM, Gora Mohanty  wrote:
> On Mon, Dec 12, 2011 at 2:24 AM, Brian Lamb
>  wrote:
>> Hi all,
>>
>> I have a few questions about how the MySQL data import works. It seems it
>> creates a separate connection for each entity I create. Is there any way to
>> avoid this?
>
> Not sure, but I do not think that it is possible. However, from your 
> description
> below, I think that you are unnecessarily multiplying entities.
>
>> By nature of my schema, I have several multivalued fields. Each one I
>> populate with a separate entity. Is there a better way to do it? For
>> example, could I pull in all the singular data in one sitting and then come
>> back in later and populate with the multivalued items.
>
> Not quite sure as to what you mean. Would it be possible for you
> to post your schema.xml, and the DIH configuration file? Preferably,
> put these on pastebin.com, and send us links. Also, you should
> obfuscate details like access passwords.
>
>> An alternate approach in some cases would be to do a GROUP_CONCAT and then
>> populate the multivalued column with some transformation. Is that possible?
> [...]
>
> This is how we have been handling it. A complete description would
> be long, but here is the gist of it:
> * A transformer will be needed. In this case, we found it easiest
>  to use a Java-based transformer. Thus, your entity should include
>  something like
>   <entity transformer="com.mycompany.search.solr.handler.JobsNumericTransformer" ...>
>  ...
>  </entity>
>  Here, the class name to be used for the transformer attribute follows
>  the usual Java rules, and the .jar needs to be made available to Solr.
> * The SELECT statement for the entity looks something like
>  select group_concat( myfield SEPARATOR '@||@')...
>  The separator should be something that does not occur in your
>  normal data stream.
> * Within the entity, define
>   <field column="myfield" ... />
> * There are complications involved if NULL values are allowed
>   for the field, in which case you would need to use COALESCE,
>   maybe along with CAST
> * The transformer would look up "myfield", split along the separator,
>   and populate the multi-valued field.
>
> This *is* a little complicated, so I would also like to hear about
> possible alternatives.
>
> Regards,
> Gora


Re: MySQL data import

2011-12-12 Thread Gora Mohanty
On Mon, Dec 12, 2011 at 2:24 AM, Brian Lamb
 wrote:
> Hi all,
>
> I have a few questions about how the MySQL data import works. It seems it
> creates a separate connection for each entity I create. Is there any way to
> avoid this?

Not sure, but I do not think that it is possible. However, from your description
below, I think that you are unnecessarily multiplying entities.

> By nature of my schema, I have several multivalued fields. Each one I
> populate with a separate entity. Is there a better way to do it? For
> example, could I pull in all the singular data in one sitting and then come
> back in later and populate with the multivalued items.

Not quite sure as to what you mean. Would it be possible for you
to post your schema.xml, and the DIH configuration file? Preferably,
put these on pastebin.com, and send us links. Also, you should
obfuscate details like access passwords.

> An alternate approach in some cases would be to do a GROUP_CONCAT and then
> populate the multivalued column with some transformation. Is that possible?
[...]

This is how we have been handling it. A complete description would
be long, but here is the gist of it:
* A transformer will be needed. In this case, we found it easiest
  to use a Java-based transformer. Thus, your entity should include
  something like
  <entity transformer="com.mycompany.search.solr.handler.JobsNumericTransformer" ...>
  ...
  </entity>
  Here, the class name to be used for the transformer attribute follows
  the usual Java rules, and the .jar needs to be made available to Solr.
* The SELECT statement for the entity looks something like
  select group_concat( myfield SEPARATOR '@||@')...
  The separator should be something that does not occur in your
  normal data stream.
* Within the entity, define
  <field column="myfield" ... />
* There are complications involved if NULL values are allowed
   for the field, in which case you would need to use COALESCE,
   maybe along with CAST
* The transformer would look up "myfield", split along the separator,
   and populate the multi-valued field.
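
Putting those pieces together, a sketch of such an entity, with illustrative
table and field names:

<entity name="item"
        transformer="com.mycompany.search.solr.handler.JobsNumericTransformer"
        query="select id,
                      group_concat(myfield SEPARATOR '@||@') as myfield
                 from mytable
                group by id">
  <field column="myfield"/>
</entity>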

This *is* a little complicated, so I would also like to hear about
possible alternatives.

Regards,
Gora


Re: MySQL data import

2011-12-12 Thread Brian Lamb
Hi all,

Any tips on this one?

Thanks,

Brian Lamb

On Sun, Dec 11, 2011 at 3:54 PM, Brian Lamb
wrote:

> Hi all,
>
> I have a few questions about how the MySQL data import works. It seems it
> creates a separate connection for each entity I create. Is there any way to
> avoid this?
>
> By nature of my schema, I have several multivalued fields. Each one I
> populate with a separate entity. Is there a better way to do it? For
> example, could I pull in all the singular data in one sitting and then come
> back in later and populate with the multivalued items.
>
> An alternate approach in some cases would be to do a GROUP_CONCAT and then
> populate the multivalued column with some transformation. Is that possible?
>
> Lastly, is it possible to use copyField to copy three regular fields into
> one multiValued field and have all the data show up?
>
> Thanks,
>
> Brian Lamb
>


MySQL data import

2011-12-11 Thread Brian Lamb
Hi all,

I have a few questions about how the MySQL data import works. It seems it
creates a separate connection for each entity I create. Is there any way to
avoid this?

By nature of my schema, I have several multivalued fields. Each one I
populate with a separate entity. Is there a better way to do it? For
example, could I pull in all the singular data in one sitting and then come
back in later and populate with the multivalued items.

An alternate approach in some cases would be to do a GROUP_CONCAT and then
populate the multivalued column with some transformation. Is that possible?

Lastly, is it possible to use copyField to copy three regular fields into
one multiValued field and have all the data show up?

Thanks,

Brian Lamb