[ https://issues.apache.org/jira/browse/SQOOP-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073987#comment-14073987 ]

Jarek Jarcec Cecho commented on SQOOP-1400:
-------------------------------------------

I got time to do some digging and found the meaning of 
{{Integer.MIN_VALUE}} explained in the [following 
note|http://dev.mysql.com/doc/refman/5.0/es/connector-j-reference-implementation-notes.html]:

{quote}
By default, ResultSets are completely retrieved and stored in memory. In most 
cases this is the most efficient way to operate and, due to the design of the 
MySQL network protocol, is easier to implement. If you are working with 
ResultSets that have a large number of rows or large values, and cannot 
allocate heap space in your JVM for the memory required, you can tell the 
driver to stream the results back one row at a time.

To enable this functionality, you need to create a Statement instance in the 
following manner:

{code}
stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,
              java.sql.ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);
{code}

The combination of a forward-only, read-only result set, with a fetch size of 
Integer.MIN_VALUE serves as a signal to the driver to stream result sets 
row-by-row. After this any result sets created with the statement will be 
retrieved row-by-row.

There are some caveats with this approach. You will have to read all of the 
rows in the result set (or close it) before you can issue any other queries on 
the connection, or an exception will be thrown.

The earliest the locks these statements hold can be released (whether they be 
MyISAM table-level locks or row-level locks in some other storage engine such 
as InnoDB) is when the statement completes.

If the statement is within scope of a transaction, then locks are released when 
the transaction completes (which implies that the statement needs to complete 
first). As with most other databases, statements are not complete until all the 
results pending on the statement are read or the active result set for the 
statement is closed.

Therefore, if using streaming results, you should process them as quickly as 
possible if you want to maintain concurrent access to the tables referenced by 
the statement producing the result set.
{quote}
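
For context, here is a minimal, self-contained sketch of the streaming pattern the note describes. The connection URL, credentials, and the use of the {{test1}} table from the report below are illustrative assumptions, not Sqoop code:

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class StreamingExample {
  public static void main(String[] args) throws SQLException {
    // Hypothetical connection details, for illustration only.
    try (Connection conn = DriverManager.getConnection(
        "jdbc:mysql://localhost/test", "user", "password")) {
      // A forward-only, read-only statement with a fetch size of
      // Integer.MIN_VALUE signals Connector/J to stream rows one at a time.
      try (Statement stmt = conn.createStatement(
          ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
        stmt.setFetchSize(Integer.MIN_VALUE);
        try (ResultSet rs = stmt.executeQuery("SELECT * FROM test1")) {
          while (rs.next()) {
            // While streaming, the connection cannot execute any other
            // statement until this result set is fully read or closed.
            System.out.println(rs.getString(1));
          }
        }
      }
    }
  }
}
{code}

The try-with-resources blocks guarantee the result set is closed, which is exactly the caveat the note warns about and the condition the exception below complains about.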

It seems to me that the purpose of the original code was to enable row-by-row 
processing when the user did not explicitly specify a fetch size. I do feel, 
though, that in that case we should fall back to the driver's default fetch 
size rather than enable row-by-row streaming of results. A sketch of what I 
have in mind follows. Would you agree [~richard_zhou]?
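
This is only a sketch of the idea, assuming the user-configured value is available as a nullable {{fetchSize}} variable; it is not the actual Sqoop code:

{code}
// Sketch only: `fetchSize` is a hypothetical Integer holding the
// user-configured value, or null when none was specified.
Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
    ResultSet.CONCUR_READ_ONLY);
if (fetchSize != null) {
  // Honor an explicit user setting.
  stmt.setFetchSize(fetchSize);
}
// Otherwise leave the driver's default in place instead of forcing
// Integer.MIN_VALUE, so other statements (such as the metadata queries
// in the stack trace below) can still run on the same connection.
{code}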

> Failed to import data using mysql-connector-java-5.1.17.jar
> -----------------------------------------------------------
>
>                 Key: SQOOP-1400
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1400
>             Project: Sqoop
>          Issue Type: Bug
>          Components: connectors/mysql
>    Affects Versions: 1.4.5
>         Environment: centos-6.4
>            Reporter: Richard
>             Fix For: 1.4.5
>
>         Attachments: patch.diff
>
>
> sqoop 1.4.5 + mysql-connector-java-5.1.31.jar, which is the latest version (works fine)
> sqoop 1.4.5 + mysql-connector-java-5.1.17.jar, which is the default version on centos 6.4 installed by rpm (fails with the error message shown below)
> {code}
> 14/07/24 10:44:48 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `test1` AS t LIMIT 1
> 14/07/24 10:44:48 ERROR manager.SqlManager: Error reading from database: java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@1cfabc3a is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
> java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@1cfabc3a is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
>       at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:934)
>       at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:931)
>       at com.mysql.jdbc.MysqlIO.checkForOutstandingStreamingData(MysqlIO.java:2735)
>       at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1899)
>       at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2151)
>       at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2619)
>       at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2569)
>       at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1521)
>       at com.mysql.jdbc.ConnectionImpl.getMaxBytesPerChar(ConnectionImpl.java:3003)
>       at com.mysql.jdbc.Field.getMaxBytesPerCharacter(Field.java:602)
>       at com.mysql.jdbc.ResultSetMetaData.getPrecision(ResultSetMetaData.java:445)
>       at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:285)
>       at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:240)
>       at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:226)
>       at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:295)
>       at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1773)
>       at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1578)
>       at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96)
>       at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
>       at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:601)
>       at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>       at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
>       at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
>       at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
>       at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
> 14/07/24 10:44:48 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: No columns to generate for ClassWriter
>       at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1584)
>       at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96)
>       at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
>       at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:601)
>       at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>       at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
>       at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
>       at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
>       at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)