[jira] Commented: (PIG-1229) allow pig to write output into a JDBC db
[ https://issues.apache.org/jira/browse/PIG-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894975#action_12894975 ]

Aaron Kimball commented on PIG-1229:

I haven't looked at how you're using HSQLDB in this patch, but I have a lot of experience using HSQLDB for testing.

If you're running one or more tests in a single process that require an HSQLDB-backed database, you do not need to create a new instance of Server. You can just set your JDBC connect string to {{jdbc:hsqldb:mem:foodbname}} and get a {{Connection}} instance to a memory-backed, single-process database called {{foodbname}}. This database will exist for the lifetime of the Java process. You can have multiple {{Connection}} instances open to this database, concurrently or serially, and it will behave the way you expect a database to.

The advantage of not using a server is that it does not require binding a port, so you can run multiple tests concurrently without worrying about collisions.

Similarly, there's no need to use the {{jdbc:hsqldb:file}} protocol unless you want to restore the contents of the database in a subsequent process. With {{jdbc:hsqldb:mem}}, there is no leftover file to clean up when your Java process ends.

Of course, if you're testing with {{MiniMRCluster}} or something similar, you'll want to start a Server so that the external mapper processes can connect to the same database via {{jdbc:hsqldb:hsql://server:port/dbname}}.

allow pig to write output into a JDBC db

Key: PIG-1229
URL: https://issues.apache.org/jira/browse/PIG-1229
Project: Pig
Issue Type: New Feature
Components: impl
Reporter: Ian Holsman
Assignee: Ankur
Priority: Minor
Fix For: 0.8.0
Attachments: jira-1229-final.patch, jira-1229-final.test-fix.patch, jira-1229-v2.patch, jira-1229-v3.patch, pig-1229.2.patch, pig-1229.patch

UDF to store data into a DB

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
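The connect strings described in the comment above can be sketched as follows. This is a minimal illustration, not code from the patch: the class name {{HsqldbUrls}} and its method names are hypothetical, the user {{SA}} with an empty password is HSQLDB's default account, and {{openInMemory}} only works if an HSQLDB jar is on the classpath (older HSQLDB releases may also need the driver class loaded explicitly).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper, for illustration only.
class HsqldbUrls {

    // In-process, memory-backed database: no Server instance, no port, no files.
    static String inMemory(String dbName) {
        return "jdbc:hsqldb:mem:" + dbName;
    }

    // Server-backed database, reachable from other processes
    // (for example external mapper tasks).
    static String server(String host, int port, String dbName) {
        return "jdbc:hsqldb:hsql://" + host + ":" + port + "/" + dbName;
    }

    // Opens a connection to the in-memory database. Every call with the same
    // name inside one JVM reaches the same database.
    // Assumes the HSQLDB driver is on the classpath; "SA" with an empty
    // password is HSQLDB's default account.
    static Connection openInMemory(String dbName) throws SQLException {
        return DriverManager.getConnection(inMemory(dbName), "SA", "");
    }
}
```

A test would typically call {{HsqldbUrls.openInMemory("foodbname")}} in its setup and close the connection in teardown; the server form is only needed when a separate process has to reach the same data.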
[jira] Commented: (PIG-1229) allow pig to write output into a JDBC db
[ https://issues.apache.org/jira/browse/PIG-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833998#action_12833998 ]

Aaron Kimball commented on PIG-1229:

Looks much better - thanks for adding the test case too. Including hsqldb.jar in your patch didn't work, by the way; I think you'll need to attach that jar to the issue separately.

allow pig to write output into a JDBC db

Key: PIG-1229
URL: https://issues.apache.org/jira/browse/PIG-1229
Project: Pig
Issue Type: New Feature
Components: impl
Reporter: Ian Holsman
Assignee: Ankur
Priority: Minor
Fix For: 0.6.0
Attachments: jira-1229.patch

UDF to store data into a DB
[jira] Commented: (PIG-1229) allow pig to write output into a JDBC db
[ https://issues.apache.org/jira/browse/PIG-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831052#action_12831052 ]

Aaron Kimball commented on PIG-1229:

Ian,

This class looks reasonable to me. You'll probably need to format it as a patch to get it accepted into the project, though. Is there a test plan for this code and/or unit tests?

Some database-specific things I've noticed:
* You create a PreparedStatement, call its executeUpdate() method several times, and then call close() on the statement. This assumes you're in auto-commit mode; I think you should configure the commit mode explicitly when creating the connection. Also, you'll probably get much better performance if you use addBatch() / executeBatch() for your batch size rather than individual executeUpdate() calls. You should then call connection.commit() and ps.clear() rather than closing the prepared statement and compiling a new one.
* If user and pass are null, I think you may need to use DriverManager.getConnection(jdbcUrl) instead of DriverManager.getConnection(jdbcUrl, null, null). Worth a unit test.
* See org.apache.hadoop.mapreduce.lib.db.DBOutputFormat in the MapReduce project for some similar code to take inspiration from.

allow pig to write output into a JDBC db

Key: PIG-1229
URL: https://issues.apache.org/jira/browse/PIG-1229
Project: Pig
Issue Type: New Feature
Components: impl
Reporter: Ian Holsman
Priority: Minor
Attachments: DbStorage.java

UDF to store data into a DB
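The review points above (explicit commit mode, batched writes, and the credential-free getConnection overload) could look roughly like the sketch below. It is not the code from DbStorage.java or from DBOutputFormat; the class {{BatchedInsertSketch}} and its methods are hypothetical. Note that java.sql.PreparedStatement has no clear() method, so the sketch assumes clearParameters() was meant.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical sketch of the suggestions in the review comment.
class BatchedInsertSketch {

    // Uses the one-argument overload when no credentials are supplied,
    // and sets the commit mode explicitly instead of relying on the default.
    static Connection open(String jdbcUrl, String user, String pass) throws SQLException {
        Connection conn = (user == null && pass == null)
                ? DriverManager.getConnection(jdbcUrl)
                : DriverManager.getConnection(jdbcUrl, user, pass);
        conn.setAutoCommit(false);
        return conn;
    }

    // Writes all rows through one compiled statement, committing once per
    // batch. Returns the number of batches executed.
    static int writeAll(Connection conn, String insertSql,
                        List<Object[]> rows, int batchSize) throws SQLException {
        int batches = 0;
        try (PreparedStatement ps = conn.prepareStatement(insertSql)) {
            int pending = 0;
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]); // JDBC parameters are 1-based
                }
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    conn.commit();
                    ps.clearParameters(); // assumed meaning of "ps.clear()"
                    pending = 0;
                    batches++;
                }
            }
            if (pending > 0) { // flush the final partial batch
                ps.executeBatch();
                conn.commit();
                batches++;
            }
        }
        return batches;
    }
}
```

The statement is compiled once and reused for every batch, so the only per-batch cost is the executeBatch() round trip and the commit. Error handling (for example a rollback on failure) is left out to keep the sketch short.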