Re: [PERFORM] What is best way to stream terabytes of data into postgresql?

2005-07-21 Thread Frank Wosczyna
 


> Subject: [PERFORM] What is best way to stream terabytes of
> data into postgresql?
>
> Preferably via JDBC, but by C/C++ if necessary.
>
> Streaming being the operative word.
>
> Tips appreciated.
 

Hi,

We contributed our Java loader to the Bizgres open source project:
http://www.bizgres.org/assets/BZ_userguide.htm#50413574_pgfId-110126

You can load from STDIN instead of a file, as long as you prepend the
stream with the Loader Control file, for example:

for name in customer orders lineitem partsupp supplier part; do
  cat TPCH_load_100gb_${name}.ctl /mnt/remote-host/TPCH-Data/${name}.tbl.* |
    loader.sh -h localhost -p 10001 -d tpch -t -u mpp
done
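
If the data is produced by a Java program rather than read from files, the same "control file first, then data" ordering can be sketched with the JDK's SequenceInputStream. This is only an illustration under assumptions: the class and method names are hypothetical, the control-file text is a placeholder (the real format is described in the Bizgres user guide), and the output stream would in practice be the stdin of a loader process, for example the one returned by ProcessBuilder's Process.getOutputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the Bizgres loader.
public class ControlPrefixedStream {

    /** Returns a stream that yields the control-file bytes first, then the data bytes. */
    public static InputStream prepend(InputStream control, InputStream data) {
        return new SequenceInputStream(control, data);
    }

    /** Copies everything from in to out in fixed-size chunks and returns the byte count. */
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder content only; a real control file follows the loader's own format.
        InputStream control = new ByteArrayInputStream(
                "(control file contents)\n".getBytes(StandardCharsets.UTF_8));
        InputStream data = new ByteArrayInputStream(
                "1|first row\n2|second row\n".getBytes(StandardCharsets.UTF_8));
        // System.out stands in for the loader's stdin here.
        copy(prepend(control, data), System.out);
    }
}
```

Because the copy works in fixed-size chunks, memory use stays constant no matter how large the data stream is.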

You can also run the loader from a remote host; in that case the -h
host is the target system running the Postgres database.

If you have terabytes of data, you might want to set a batch size (-b
switch) so that the loader commits periodically during the load.
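
For anyone going through plain JDBC instead of the loader, the same idea of committing every N rows could look roughly like the sketch below. It is not the loader's implementation. The class, the BatchSink interface, and the single-column insert in jdbcSink are hypothetical; the JDBC calls themselves (setAutoCommit, addBatch, executeBatch, commit) are standard java.sql API.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: read rows from a stream and hand them off in fixed-size batches.
public class BatchedStreamLoader {

    /** Receives one batch of rows; an implementation is expected to write and commit it. */
    public interface BatchSink {
        void writeBatch(List<String> rows) throws Exception;
    }

    /** Reads lines until end of stream, flushing every batchSize rows. Returns the row count. */
    public static long load(BufferedReader in, int batchSize, BatchSink sink) throws Exception {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> batch = new ArrayList<>(batchSize);
        long total = 0;
        String line;
        while ((line = in.readLine()) != null) {
            batch.add(line);
            if (batch.size() == batchSize) {
                sink.writeBatch(batch);
                total += batch.size();
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {          // final partial batch
            sink.writeBatch(batch);
            total += batch.size();
        }
        return total;
    }

    /**
     * A JDBC-backed sink. insertSql is assumed to take exactly one text parameter,
     * e.g. "INSERT INTO staging_lines (raw_line) VALUES (?)" (hypothetical table).
     */
    public static BatchSink jdbcSink(Connection conn, String insertSql) throws SQLException {
        conn.setAutoCommit(false);       // commit explicitly, once per batch
        return rows -> {
            try (PreparedStatement ps = conn.prepareStatement(insertSql)) {
                for (String row : rows) {
                    ps.setString(1, row);
                    ps.addBatch();
                }
                ps.executeBatch();
            }
            conn.commit();
        };
    }

    public static void main(String[] args) throws Exception {
        // In-memory demo with a sink that only reports batch sizes; no database needed.
        BufferedReader in = new BufferedReader(new StringReader("a\nb\nc\nd\ne\n"));
        long n = load(in, 2, rows -> System.out.println("batch of " + rows.size()));
        System.out.println("rows read: " + n);
    }
}
```

With a real connection, the call would be along the lines of load(reader, 10000, jdbcSink(conn, sql)), where reader wraps System.in or a socket. The batch size of 10000 is an arbitrary starting point, not a recommendation from the loader documentation.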

Feel free to contact me directly if you have questions.

Thanks,

Frank

Frank Wosczyna
Systems Engineer
Greenplum / Bizgres MPP
www.greenplum.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] What is best way to stream terabytes of data into postgresql?

2005-07-21 Thread Josh Berkus
Jeff,

> Streaming being the operative word.

Not sure how much hacking you want to do, but the TelegraphCQ project is 
based on PostgreSQL:
http://telegraph.cs.berkeley.edu/telegraphcq/v0.2/

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco
