[jira] Commented: (PIG-1229) allow pig to write output into a JDBC db

2010-08-03 Thread Aaron Kimball (JIRA)

[ https://issues.apache.org/jira/browse/PIG-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894975#action_12894975 ]

Aaron Kimball commented on PIG-1229:


Haven't looked at how you're using hsqldb in this patch, but I've got a lot of 
experience using HSQLDB for testing.

If you're running one or more tests in a single process that requires an 
HSQLDB-backed database, you do not need to create a new instance of Server. You 
can just set your JDBC connect string to {{jdbc:hsqldb:mem:foodbname}} and get 
a {{Connection}} instance to a memory-backed single-process database called 
{{foodbname}}. This database will exist for the lifetime of the Java process. 
You can have multiple {{Connection}} instances (concurrently or serially) open 
to this database, and it will behave the way you would expect a database to. 
The advantage of not using a server is that this does not require binding a 
port; therefore you can run multiple tests concurrently without worrying about 
collisions. Similarly, there's no need to use the {{jdbc:hsqldb:file}} protocol 
unless you want to restore the contents of the database in a subsequent 
process. When your Java process ends, you won't have a leftover file to clean up 
with {{jdbc:hsqldb:mem}}.
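
As a rough illustration of the in-memory setup described above, here is a minimal sketch. It assumes an HSQLDB 2.x jar on the test classpath (so the JDBC driver is discovered automatically) and HSQLDB's default {{SA}} user with an empty password. The class name {{HsqldbMemExample}}, its methods, and the table {{t}} are hypothetical and are not part of the patch.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper for tests; not part of the PIG-1229 patch.
final class HsqldbMemExample {

    // Builds the connect string for a memory-backed, single-process database.
    static String memUrl(String dbName) {
        return "jdbc:hsqldb:mem:" + dbName;
    }

    // Opens two connections to the same in-memory database, writes through
    // one and reads through the other. No Server instance and no port needed.
    // Assumes the HSQLDB driver is on the classpath and the default "SA" user.
    static int roundTrip(String url) throws SQLException {
        try (Connection writer = DriverManager.getConnection(url, "SA", "");
             Connection reader = DriverManager.getConnection(url, "SA", "")) {
            try (Statement st = writer.createStatement()) {
                st.executeUpdate("CREATE TABLE t (id INTEGER)");
                st.executeUpdate("INSERT INTO t VALUES (1)");
            }
            try (Statement st = reader.createStatement();
                 ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM t")) {
                rs.next();
                return rs.getInt(1);
            }
        }
    }
}
```

A test would call {{HsqldbMemExample.roundTrip(HsqldbMemExample.memUrl("foodbname"))}}. Giving each test its own database name keeps concurrently running tests from seeing each other's tables.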

Of course, if you're testing with {{MiniMRCluster}} or something, you'll want 
to start a Server so that the external mapper processes can connect to the same 
database via {{jdbc:hsqldb:hsql://server:port/dbname}}. 



 allow pig to write output into a JDBC db
 

Key: PIG-1229
URL: https://issues.apache.org/jira/browse/PIG-1229
Project: Pig
Issue Type: New Feature
Components: impl
Reporter: Ian Holsman
Assignee: Ankur
Priority: Minor
Fix For: 0.8.0

Attachments: jira-1229-final.patch, jira-1229-final.test-fix.patch, 
  jira-1229-v2.patch, jira-1229-v3.patch, pig-1229.2.patch, pig-1229.patch


 UDF to store data into a DB

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1229) allow pig to write output into a JDBC db

2010-02-15 Thread Aaron Kimball (JIRA)

[ https://issues.apache.org/jira/browse/PIG-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833998#action_12833998 ]

Aaron Kimball commented on PIG-1229:


Looks much better - thanks for adding the test case too. Including hsqldb.jar 
in your patch didn't work, by the way -- I think you'll need to attach that jar 
to the issue separately.


 allow pig to write output into a JDBC db
 

Key: PIG-1229
URL: https://issues.apache.org/jira/browse/PIG-1229
Project: Pig
Issue Type: New Feature
Components: impl
Reporter: Ian Holsman
Assignee: Ankur
Priority: Minor
Fix For: 0.6.0

Attachments: jira-1229.patch


 UDF to store data into a DB




[jira] Commented: (PIG-1229) allow pig to write output into a JDBC db

2010-02-08 Thread Aaron Kimball (JIRA)

[ https://issues.apache.org/jira/browse/PIG-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831052#action_12831052 ]

Aaron Kimball commented on PIG-1229:


Ian, 

This class looks reasonable to me. You'll probably need to format this as a 
patch to get it accepted into the project though.

Is there a test plan for this code and/or unit tests?

Some database-specific things I've noticed: 
* You create a PreparedStatement, call its executeUpdate() method several 
times, and then call close() on the statement. This assumes you're in auto-commit 
mode; I think you should configure the commit mode explicitly when creating the 
connection. Also, you'll probably get much better performance if you use 
addBatch() / executeBatch() for your batch size rather than individual 
executeUpdate() calls. You should then call connection.commit() and 
ps.clearBatch() rather than closing the prepared statement and compiling a new one. 
* If user and pass are null, I think you may need to use 
DriverManager.getConnection(jdbcUrl) instead of 
DriverManager.getConnection(jdbcUrl, null, null). Worth a unit test.
* See org.apache.hadoop.mapreduce.lib.db.DBOutputFormat in the MapReduce 
project for some similar code to take inspiration from. 
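
The first two points above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from DbStorage.java: the class {{BatchedJdbcWriter}} and its methods {{open}} and {{writeAll}} are hypothetical names. It sets the commit mode explicitly, reuses one PreparedStatement across batches, and resets it with clearBatch(), which is the java.sql method for discarding queued batch entries.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical sketch; names are illustrative and not taken from DbStorage.java.
final class BatchedJdbcWriter {

    // Uses the one-argument getConnection overload when no credentials are
    // supplied, and sets the commit mode explicitly instead of relying on
    // the driver default.
    static Connection open(String jdbcUrl, String user, String pass) throws SQLException {
        Connection conn = (user == null && pass == null)
                ? DriverManager.getConnection(jdbcUrl)
                : DriverManager.getConnection(jdbcUrl, user, pass);
        conn.setAutoCommit(false);
        return conn;
    }

    // Inserts all rows with a single PreparedStatement, committing once per
    // batch of batchSize rows. Returns the number of rows submitted.
    static int writeAll(Connection conn, String insertSql,
                        List<Object[]> rows, int batchSize) throws SQLException {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        int written = 0;
        int pending = 0;
        try (PreparedStatement ps = conn.prepareStatement(insertSql)) {
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]); // JDBC parameters are 1-based
                }
                ps.addBatch();
                pending++;
                if (pending == batchSize) {
                    written += flush(conn, ps, pending);
                    pending = 0;
                }
            }
            if (pending > 0) {
                written += flush(conn, ps, pending); // final partial batch
            }
        }
        return written;
    }

    // Sends the queued rows, commits, and resets the statement for reuse.
    private static int flush(Connection conn, PreparedStatement ps, int pending)
            throws SQLException {
        ps.executeBatch();
        conn.commit();
        ps.clearBatch();
        return pending;
    }
}
```

With five rows and a batch size of two, this issues three executeBatch() calls and three commits. Whether a failed batch should be rolled back or retried is left open here; that choice belongs to the store function's error handling.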


 allow pig to write output into a JDBC db
 

Key: PIG-1229
URL: https://issues.apache.org/jira/browse/PIG-1229
Project: Pig
Issue Type: New Feature
Components: impl
Reporter: Ian Holsman
Priority: Minor
Attachments: DbStorage.java


 UDF to store data into a DB
