Re: [GENERAL] Using the database to validate data

2015-07-27 Thread JPLapham
Zdeněk Bělehrádek wrote What about creating a SAVEPOINT before each INSERT, and if the INSERT returns an error, then ROLLBACK TO SAVEPOINT? This way you will have all the insertable data in your table, and you can still ROLLBACK the whole transaction, or COMMIT it if there were no errors.

Re: [GENERAL] Using the database to validate data

2015-07-27 Thread JPLapham
林士博 wrote If I am following correctly, you can do it in your application as follows. 1.begin transaction 2.insert each data. Catch db exception, and save exception message and other information you need to array. 3.in the end ,you can get all the information about the wrong data in array if

Re: [GENERAL] Using the database to validate data

2015-07-27 Thread Adam Brusselback
A little late to the party, but i'll share how I do my data imports / validation for anyone interested. I have a bunch of data that comes in from various sources, and it isn't always guaranteed to be in the correct format, have the right foreign keys, or even the right data types. I have a

Re: [GENERAL] Using the database to validate data

2015-07-27 Thread 林士博
No, when you catch an insert exception , you can save the information you needed to array or something likes that, and continue to do the next insert. In your application, you can write codes as follows. begin transaction loop try do insert catch exception save error info

Re: [GENERAL] Using the database to validate data

2015-07-26 Thread Allan Kamau
To add onto what others a have said. I would use a bash script (and awk) to prepared each record of the raw CSV file with a dataset name, name of the file, timestamp and a serial number and place the newly generated data into a new file. In this bash script, the value of the dataset name, name of

Re: [GENERAL] Using the database to validate data

2015-07-25 Thread Jon Lapham
On 07/23/2015 03:02 PM, Adrian Klaver wrote: http://pgloader.io/ Ok, thanks, I'll look into pgloader's data validation abilities. However, my naive understanding of pgloader is that it is used to quickly load data into a database, which is not what I am looking to do. I want to validate

Re: [GENERAL] Using the database to validate data

2015-07-25 Thread JPLapham
Tim Clarke wrote Shouldn't be too difficult to import those new rows into one table, write a procedure that inserts them into the real table one by one and logs the validation failure if any - committing good rows and rolling back bad. In fact if you could then write the failures to a third

Re: [GENERAL] Using the database to validate data

2015-07-25 Thread rob stone
On Thu, 2015-07-23 at 15:34 -0700, JPLapham wrote: Tim Clarke wrote Shouldn't be too difficult to import those new rows into one table, write a procedure that inserts them into the real table one by one and logs the validation failure if any - committing good rows and rolling back

Re: [GENERAL] Using the database to validate data

2015-07-24 Thread Jeff Janes
On Fri, Jul 24, 2015 at 5:17 AM, Zdeněk Bělehrádek zdenek.belehra...@superhosting.cz wrote: What about creating a SAVEPOINT before each INSERT, and if the INSERT returns an error, then ROLLBACK TO SAVEPOINT? Make sure you release the savepoint if there was no error. Otherwise you will

Re: [GENERAL] Using the database to validate data

2015-07-24 Thread Jeff Janes
On Thu, Jul 23, 2015 at 5:55 AM, JPLapham lap...@jandr.org wrote: Hello, I have an application that occasionally performs large batch inserts of user hand-generated data. Input is a tab delimited file with typically hundreds to a thousand lines of data. Because the data is generated by

Re: [GENERAL] Using the database to validate data

2015-07-24 Thread Zdeněk Bělehrádek
What about creating a SAVEPOINT before each INSERT, and if the INSERT returns an error, then ROLLBACK TO SAVEPOINT? This way you will have all the insertable data in your table, and you can still ROLLBACK the whole transaction, or COMMIT it if there were no errors. It will probably be quite

Re: [GENERAL] Using the database to validate data

2015-07-23 Thread 林士博
Yes. You are right. Do it in postgresql procedure is faster than in application.

Re: [GENERAL] Using the database to validate data

2015-07-23 Thread Adrian Klaver
On 07/23/2015 08:20 PM, 林士博 wrote: If I am following correctly, you can do it in your application as follows. 1.begin transaction 2.insert each data. Catch db exception, and save exception message and other information you need to array. 3.in http://3.in the end ,you can get all the information

Re: [GENERAL] Using the database to validate data

2015-07-23 Thread 林士博
If I am following correctly, you can do it in your application as follows. 1.begin transaction 2.insert each data. Catch db exception, and save exception message and other information you need to array. 3.in the end ,you can get all the information about the wrong data in array if there is any.

Re: [GENERAL] Using the database to validate data

2015-07-23 Thread Adrian Klaver
On 07/23/2015 12:04 PM, Jon Lapham wrote: On 07/23/2015 03:02 PM, Adrian Klaver wrote: http://pgloader.io/ Ok, thanks, I'll look into pgloader's data validation abilities. However, my naive understanding of pgloader is that it is used to quickly load data into a database, which is not what I

Re: [GENERAL] Using the database to validate data

2015-07-23 Thread Tim Clarke
Shouldn't be too difficult to import those new rows into one table, write a procedure that inserts them into the real table one by one and logs the validation failure if any - committing good rows and rolling back bad. In fact if you could then write the failures to a third table with a completely

Re: [GENERAL] Using the database to validate data

2015-07-23 Thread Adrian Klaver
On 07/23/2015 05:55 AM, JPLapham wrote: Hello, I have an application that occasionally performs large batch inserts of user hand-generated data. Input is a tab delimited file with typically hundreds to a thousand lines of data. Because the data is generated by hand, there are always many

[GENERAL] Using the database to validate data

2015-07-23 Thread JPLapham
Hello, I have an application that occasionally performs large batch inserts of user hand-generated data. Input is a tab delimited file with typically hundreds to a thousand lines of data. Because the data is generated by hand, there are always many transaction-stopping errors in a typical input