Bug#675106: ITP: pgbulkload -- A high speed data loading utility for PostgreSQL

2012-05-29 Thread Alexander Kuznetsov
Package: wnpp
Severity: wishlist
Owner: Alexander Kuznetsov a...@cpan.org

* Package name: pgbulkload
  Version : 3.1.1
  Upstream Author :
    Takahiro Itagaki    itagaki.takahiro @nospam@ gmail.com
    Masao Fujii         masao.fujii @nospam@ gmail.com
    Mitsuru Hasegawa    hasegawa @nospam@ metrosystems.co.jp
    Masahiko Sakamoto   sakamoto_masahiko_b1 @nospam@ lab.ntt.co.jp
    Toru SHIMOGAKI      shimogaki.toru @nospam@ oss.ntt.co.jp
* URL : http://pgfoundry.org/projects/pgbulkload/
* License : BSD
  Programming Lang: C, SQL
  Description : A high speed data loading utility for PostgreSQL
 pg_bulkload is designed to load huge amount of data to a database.
 You can choose whether database constraints are checked and how many errors are
 ignored during the loading. For example, you can skip integrity checks for
 performance when you copy data from another database to PostgreSQL. On the
 other hand, you can enable constraint checks when loading unclean data.
 .
 The original goal of pg_bulkload was an faster alternative of COPY command in
 PostgreSQL, but version 3.0 or later has some ETL features like input data
 validation and data transformation with filter functions.
 .
 In version 3.1, pg_bulkload can convert the load data into the binary file
 which can be used as an input file of pg_bulkload. If you check whether
 the load data is valid when converting it into the binary file, you can skip
 the check when loading it from the binary file to a table. Which would reduce
 the load time itself. Also in version 3.1, parallel loading works
 more effectively than before.
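
For reference, a minimal SQL sketch of the workflow described above.  The
table, file path and filter function below are made up for illustration,
and the exact calling convention pg_bulkload expects from filter functions
is an assumption here rather than something taken from its documentation:

  -- Baseline: PostgreSQL's built-in COPY, which pg_bulkload aims to outperform.
  CREATE TABLE measurements (id integer, reading numeric, taken_at timestamptz);
  COPY measurements FROM '/tmp/measurements.csv' WITH (FORMAT csv);

  -- Hypothetical filter function for the ETL-style cleanup mentioned above:
  -- it takes the raw input fields as text and returns a row of the target
  -- table type, normalising empty readings to NULL during the load.
  CREATE FUNCTION clean_measurement(raw_id text, raw_reading text, raw_ts text)
  RETURNS measurements AS $$
    SELECT $1::integer,
           nullif($2, '')::numeric,
           $3::timestamptz;
  $$ LANGUAGE SQL;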



Re: Bug#675106: ITP: pgbulkload -- A high speed data loading utility for PostgreSQL

2012-05-29 Thread Ivan Shmakov
 Alexander Kuznetsov a...@cpan.org writes:

[…]

(Some wording fixes and suggestions.)

  Description : A high speed data loading utility for PostgreSQL
  pg_bulkload is designed to load huge amount of data to a database.
  You can choose whether database constraints are checked and how many errors 
  are

If “You can…” here starts a new paragraph, there ought to be
an empty (“.”) line.  And if not, the line break here came a bit
earlier than necessary.

  ignored during the loading. For example, you can skip integrity checks for
  performance when you copy data from another database to PostgreSQL. On the
  other hand, you can enable constraint checks when loading unclean data.
  .

Are “constraint checks” different from “integrity checks” in the
above?  Unless they are, it should rather read, e. g.:

   … For example, you can skip integrity checks for performance when you
   copy data from another database to PostgreSQL, or have them in place
   when loading potentially unclean data.

  The original goal of pg_bulkload was an faster alternative of COPY command in

   … was /a/ faster…

Or, perhaps: … was to provide a faster…

  PostgreSQL, but version 3.0 or later has some ETL features like input data
  validation and data transformation with filter functions.
  .

   … but as of version 3.0 some ETL features… were added.

And what's ETL, BTW?

  In version 3.1, pg_bulkload can convert the load data into the binary file
  which can be used as an input file of pg_bulkload. If you check whether

Perhaps:

   As of version 3.1, pg_bulkload can dump the preprocessed data into a
   binary file, allowing for…

(Here, the purpose should be mentioned.  Is this for improving
the performance of later multiple “bulkloads”, for instance?)

  the load data is valid when converting it into the binary file, you can skip
  the check when loading it from the binary file to a table. Which would reduce
  the load time itself. Also in version 3.1, parallel loading works
  more effectively than before.

s/effectively/efficiently/.  But the whole sentence makes little
sense, as the earlier versions weren't packaged for Debian.

-- 
FSF associate member #7257

