Re: [sqoop-2639] patch for sqoop-2639

2018-09-17 Thread Fero Szabo
Hi,

Can you please link the review in the Jira?
(Or at least on this thread?)

Thanks,
Fero

On Fri, Sep 14, 2018 at 8:39 AM, CharSyam  wrote:

> Thanks Fero. I created review in ReviewBoard.
>
> 2018년 9월 14일 (금) 오후 8:34, Fero Szabo 님이 작성:
>
> > Hi,
> >
> > Thank you for your contribution!
> >
> > Since the Sqoop community is using ReviewBoard for reviews, the next step
> > would be to create a review there.
> >
> > This is how I explained this process on another Jira:
> > Please go to Review Board at https://reviews.apache.org/account/login/
> and
> > register if you haven't done so far. Then, create a patch by invoking
> *git
> > diff > SQOOP-2949
> > -1.patch on the
> command
> > line. Finally, create a review using the sqoop-trunk* repository and your
> > patch. Fill in the necessary fields, as for example, in this review:
> > https://reviews.apache.org/r/65607/  (no need for a description this
> long,
> > nobody likes to read this much   ).
> >
> > After your patch has been reviewed, hopefully, a committer or PMC will
> pick
> > it up and commit it.
> >
> > Best Regards,
> > Fero
> >
> >
> > On Fri, Sep 14, 2018 at 1:06 PM, CharSyam  wrote:
> >
> > > you can test adding --mysql-charset UTF-8 options in commandline
> > >
> > > 2018년 9월 14일 (금) 오후 8:05, CharSyam 님이 작성:
> > >
> > > > Hi, I made a patch for sqoop-2639
> > > > actually this is for 1.4.6 branch, I also made for trunk branch.
> > > >
> > > > How I can upload this patch or get review?
> > > >
> > > > Thanks. all
> > > >
> > >
> >
> >
> >
> > --
> > *Ferenc Szabo* | Software Engineer
> > t. (+361) 701 1201 <+361+701+1201>
> > cloudera.com 
> >
> > [image: Cloudera] 
> >
> > [image: Cloudera on Twitter]  [image:
> > Cloudera on Facebook]  [image:
> Cloudera
> > on LinkedIn] 
> > --
> >
>



-- 
*Ferenc Szabo* | Software Engineer
t. (+361) 701 1201 <+361+701+1201>
cloudera.com 

[image: Cloudera] 

[image: Cloudera on Twitter]  [image:
Cloudera on Facebook]  [image: Cloudera
on LinkedIn] 
--


[jira] [Commented] (SQOOP-2639) Unable to export utf-8 data to MySQL using --direct mode

2018-09-17 Thread Fero Szabo (JIRA)


[ 
https://issues.apache.org/jira/browse/SQOOP-2639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618300#comment-16618300
 ] 

Fero Szabo commented on SQOOP-2639:
---

Hi [~charsyam],

I presume you want to contribute to Sqoop? ;)

I cannot grant you any privileges, since I'm just a contributor myself, but in 
this case, you could ask to be added as contributor on the Sqoop-dev mailing 
list.

Best Regards,

Fero

> Unable to export utf-8 data to MySQL using --direct mode
> 
>
> Key: SQOOP-2639
> URL: https://issues.apache.org/jira/browse/SQOOP-2639
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors/mysql
>Affects Versions: 1.4.6
>Reporter: Ranjan Bagchi
>Priority: Major
> Attachments: sqoop-2639.patch
>
>
> I am able to import utf-8 data (non-latin1) data successfully into HDFS via:
> sqoop import --connect jdbc:mysql://host/db --username XX --password YY \
> --mysql-delimiters \
> --table MYSQL_SRC_TABLE --target-dir ${SQOOP_DIR_PREFIX}/mysql_table 
> --direct 
> However, using 
> sqoop export --connect  jdbc:mysql://host/db --username XX --password YY \
> --mysql-delimiters \
> --table MYSQL_DEST_TABLE --export-dir ${SQOOP_DIR_PREFIX}/mysql_table 
> \
> --direct 
> Cuts off the fields after the first non-latin1 character (eg a letter w/ an 
> umlaut).
> I tried other options like  -- --default-character-set=utf8, without success.
> I was able to fix the problem with the following change:
> Change 
> https://svn.apache.org/repos/asf/sqoop/trunk/src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java,
>  line 322 from 
> this.mysqlCharSet = MySQLUtils.MYSQL_DEFAULT_CHARSET;
> to
> this.mysqlCharSet = "utf-8"; 
> Hope this helps



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 68717: [sqoop-2639] supporting mysql export with specific encoding with --mysql-charset option

2018-09-17 Thread Dae Myung Kang


> On Sept. 17, 2018, 7:08 a.m., Szabolcs Vasas wrote:
> > Hi Dae Myung,
> > 
> > Thank you for submitting this patch!
> > I think it is a very good start but there are a few improvements we should 
> > make before we can commit it.
> > 
> > - The JIRA says that this is an improvement for the MySQL direct connector 
> > but the change suggests that it affects the plain MySQL connector only. Can 
> > you please clarify this?
> > - Since the option you introduce only affects the MySQL connector I think 
> > it should be added as an extra argument to 
> > org.apache.sqoop.manager.MySQLManager. You can see a good example in 
> > org.apache.sqoop.manager.PostgresqlManager#parseExtraArgs.
> > - Apart from manual testing you should add automated tests too. I think 
> > creating a test file with UTF-8 encoding and special characters and 
> > exporting it into MySQL would be a great test case. You can see a few 
> > examples in org.apache.sqoop.manager.mysql.DirectMySQLExportTest and 
> > org.apache.sqoop.manager.mysql.JdbcMySQLExportTest.
> > 
> > If you need any help with the above I am happy to help.
> > 
> > Regards,
> > Szabolcs

Thanks Szabolcs, you make sense, I will apply your reviews.


- Dae Myung


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68717/#review208672
---


On Sept. 14, 2018, 4:47 p.m., Dae Myung Kang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68717/
> ---
> 
> (Updated Sept. 14, 2018, 4:47 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: sqoop-2639
> https://issues.apache.org/jira/browse/sqoop-2639
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> sqmanually test with testing senario.
> 
> It will just change mysqlCharset by commandline option
> 
> when not given --mysql-charset 
> 
> select query returned 
> 
> ```
> FX
> ```
> 
> given --mysql-charset=UTF-8
> ```
> FX <-- I think ReviewBoard can't support utf8 now. even I posted valid 
> utf-8 string
> ```
> 
> 
> Diffs
> -
> 
>   src/java/org/apache/sqoop/SqoopOptions.java f06872f9 
>   src/java/org/apache/sqoop/manager/MySQLUtils.java b005c793 
>   src/java/org/apache/sqoop/mapreduce/MySQLExportJob.java e17f3df8 
>   src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java bb751ee6 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 955d3a65 
>   src/java/org/apache/sqoop/tool/ExportTool.java 060f2c07 
> 
> 
> Diff: https://reviews.apache.org/r/68717/diff/2/
> 
> 
> Testing
> ---
> 
> manually test, this is only change commandline cli
> 
> 
> Thanks,
> 
> Dae Myung Kang
> 
>



Re: Review Request 68717: [sqoop-2639] supporting mysql export with specific encoding with --mysql-charset option

2018-09-17 Thread Szabolcs Vasas

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68717/#review208672
---



Hi Dae Myung,

Thank you for submitting this patch!
I think it is a very good start but there are a few improvements we should make 
before we can commit it.

- The JIRA says that this is an improvement for the MySQL direct connector but 
the change suggests that it affects the plain MySQL connector only. Can you 
please clarify this?
- Since the option you introduce only affects the MySQL connector I think it 
should be added as an extra argument to org.apache.sqoop.manager.MySQLManager. 
You can see a good example in 
org.apache.sqoop.manager.PostgresqlManager#parseExtraArgs.
- Apart from manual testing you should add automated tests too. I think 
creating a test file with UTF-8 encoding and special characters and exporting 
it into MySQL would be a great test case. You can see a few examples in 
org.apache.sqoop.manager.mysql.DirectMySQLExportTest and 
org.apache.sqoop.manager.mysql.JdbcMySQLExportTest.

If you need any help with the above I am happy to help.

Regards,
Szabolcs


src/java/org/apache/sqoop/tool/BaseSqoopTool.java
Lines 1872 (patched)


Do we need this empty method?


- Szabolcs Vasas


On Sept. 14, 2018, 4:47 p.m., Dae Myung Kang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68717/
> ---
> 
> (Updated Sept. 14, 2018, 4:47 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: sqoop-2639
> https://issues.apache.org/jira/browse/sqoop-2639
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> ---
> 
> sqmanually test with testing senario.
> 
> It will just change mysqlCharset by commandline option
> 
> when not given --mysql-charset 
> 
> select query returned 
> 
> ```
> FX
> ```
> 
> given --mysql-charset=UTF-8
> ```
> FX <-- I think ReviewBoard can't support utf8 now. even I posted valid 
> utf-8 string
> ```
> 
> 
> Diffs
> -
> 
>   src/java/org/apache/sqoop/SqoopOptions.java f06872f9 
>   src/java/org/apache/sqoop/manager/MySQLUtils.java b005c793 
>   src/java/org/apache/sqoop/mapreduce/MySQLExportJob.java e17f3df8 
>   src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java bb751ee6 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 955d3a65 
>   src/java/org/apache/sqoop/tool/ExportTool.java 060f2c07 
> 
> 
> Diff: https://reviews.apache.org/r/68717/diff/2/
> 
> 
> Testing
> ---
> 
> manually test, this is only change commandline cli
> 
> 
> Thanks,
> 
> Dae Myung Kang
> 
>