Incorrect DBInputFormat transaction context
-------------------------------------------
                 Key: HADOOP-5960
                 URL: https://issues.apache.org/jira/browse/HADOOP-5960
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.20.0, 0.19.1, 0.19.0
         Environment: Mac OSX 10.5.6, IntelliJ 7.0.5
            Reporter: Yuchen

In my Map/Reduce job, I use DBInputFormat to fetch the original tasks, for its convenience. I also need to update my MySQL database occasionally in our reducer. Because I need to "update" rather than "insert", I cannot use DBOutputFormat, so I use my own JDBC calls. I make my own connection like this:

    Class.forName("com.mysql.jdbc.Driver").newInstance();
    conn = DriverManager.getConnection(jdbcUrl);

However, every time I try to do the update, I get an SQL exception "transaction lock time out; try restarting transaction", even though I didn't use a transaction at all in my update (setAutoCommit to false).

Digging into the Hadoop code, I found these lines in DBInputFormat:

    this.connection.setAutoCommit(false);
    connection.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);

When I comment them out (along with the connection.commit()), everything works fine. I also found that the connection in DBInputFormat is never closed.

I am wondering why we need to set autocommit and the transaction isolation level at all in DBInputFormat, and why I can't override them in my own JDBC call, even if I explicitly set autocommit to false and the transaction isolation level to the default (repeatable read).
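For reference, here is a minimal sketch of how the reducer-side update could set its own connection options explicitly and always close the connection. It assumes Java 7 or later (for try-with-resources); the class name ReducerDbUpdate, the table "tasks", and its columns "status" and "id" are hypothetical and not taken from the report. Note that in JDBC, setAutoCommit(false) does not disable transactions: it leaves a transaction open until commit() or rollback() is called, whereas setAutoCommit(true) commits each statement on its own.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class ReducerDbUpdate {

    /** Applies explicit settings to the reducer's own connection. */
    static void configure(Connection conn) throws SQLException {
        // Autocommit on: each statement is committed as soon as it completes.
        conn.setAutoCommit(true);
        // Request repeatable read explicitly instead of relying on a default.
        conn.setTransactionIsolation(Connection.TRANSACTION_REPEATABLE_READ);
    }

    /**
     * Runs a single UPDATE and returns the number of rows changed.
     * The table and column names are hypothetical placeholders.
     */
    static int updateStatus(String jdbcUrl, long taskId, String status)
            throws SQLException {
        // try-with-resources closes the statement and connection even on error.
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            configure(conn);
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPDATE tasks SET status = ? WHERE id = ?")) {
                ps.setString(1, status);
                ps.setLong(2, taskId);
                return ps.executeUpdate();
            }
        }
    }
}
```

These settings only affect the reducer's own connection. If a different connection, such as the one opened by DBInputFormat, is holding locks inside an uncommitted transaction, the update may still wait on those locks until that other transaction ends.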