Excellent information here. Thanks Lee and Peter.
Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
> On Oct 24, 2016, at 6:57 AM, Peter Wicks (pwicks) wrote:
Prabhu,
Lee mentioned making sure you have good indexes, but I would caution you on
this point. If you have a unique constraint then SQL Server will build an
index on it automatically, but I would suggest dropping all other indexes
that aren’t related to data integrity. Each time SQL Server inserts a row,
it must also update every index on the table, which adds significant
overhead during a bulk load.
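The index-maintenance cost Peter describes can be demonstrated with a quick, hedged sketch. This uses Python's built-in sqlite3 as a stand-in for SQL Server (the table and index names are made up for illustration); the principle is the same: every secondary index is extra work on each insert.

```python
# Illustrative only: sqlite3 stands in for SQL Server to show that
# secondary indexes add per-row maintenance cost on bulk inserts.
import sqlite3
import time

def timed_insert(with_extra_indexes):
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a TEXT, b TEXT)")
    if with_extra_indexes:
        # Two secondary indexes that must be updated on every insert.
        cur.execute("CREATE INDEX ix_a ON t(a)")
        cur.execute("CREATE INDEX ix_b ON t(b)")
    rows = [(i, f"a{i}", f"b{i}") for i in range(100_000)]
    start = time.perf_counter()
    cur.executemany("INSERT INTO t (id, a, b) VALUES (?, ?, ?)", rows)
    conn.commit()
    return time.perf_counter() - start

print(f"no extra indexes:  {timed_insert(False):.2f}s")
print(f"two extra indexes: {timed_insert(True):.2f}s")
```

On most machines the second run is noticeably slower, which is why dropping non-essential indexes before a large load (and rebuilding them afterwards) is a common bulk-load tactic.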
Hello Prabhu,
50 minutes is a good start! Now we have to *determine where the next
bottleneck is* - check to see where the flow files are queueing. You can
also check the "average task duration" statistic for each processor. I
suspect the bottleneck is at PutSQL and will carry this assumption forward
for now.
Lee,
I have tried your suggested flow, which is able to insert the data into SQL
Server in 50 minutes, but it still takes a long time.
*==>* Your query:
*You might be processing the entire dat file (instead of a single row) for
each record.*
How can I process the entire dat file into SQL Server?
Prabhu,
In order to move 3M rows in 10 minutes, you'll need to process 5,000
rows/second.
During your 4-hour run, you were processing ~200 rows/second.
Without any new optimizations you'll need ~25 threads and sufficient memory
to feed them. I agree with Mark that you should be able to get far better
throughput than you are seeing now.
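Lee's arithmetic can be checked with a quick sketch. The 3M-row total, the 10-minute target, and the 4-hour observed run are from the thread; everything else is derived:

```python
# Back-of-the-envelope check of the throughput numbers in the thread.
total_rows = 3_000_000
target_minutes = 10
observed_hours = 4

required_rate = total_rows / (target_minutes * 60)    # rows/sec needed
observed_rate = total_rows / (observed_hours * 3600)  # rows/sec measured

# If each concurrent task sustains the observed single-thread rate,
# roughly this many tasks are needed to hit the target.
threads_needed = required_rate / observed_rate

print(required_rate)         # 5000.0
print(round(observed_rate))  # 208
print(round(threads_needed)) # 24
```

Rounding the observed rate down to 200 rows/second, as Lee does, gives the ~25-thread figure quoted above.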
Prabhu,
Certainly, the performance that you are seeing, taking 4-5 hours to move 3M
rows into SQLServer, is far from ideal, but the good news is that it is
also far from typical. You should be able to see far better results.
To help us understand what is limiting the performance, and to make useful
suggestions, it would help to know more about your flow configuration.
Hi All,
I have tried to perform the operation below.
dat file (input) --> JSON --> SQL --> SQLServer
GetFile-->SplitText-->SplitText-->ExtractText-->ReplaceText-->ConvertJsonToSQL-->PutSQL.
My input file (.dat) --> 3,000,000 rows.
*Objective:* Move the data from the '.dat' file into SQLServer.
I am able to insert the data, but it takes 4-5 hours.
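A common reason a split-per-row flow like the one above runs slowly is that every row becomes its own flow file and its own INSERT round-trip. Here is a minimal, hedged sketch of the difference between row-at-a-time inserts and batched inserts; sqlite3 stands in for SQL Server, and the table name is invented for illustration:

```python
import sqlite3
import time

def load(rows, batched):
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE records (id INTEGER, payload TEXT)")
    start = time.perf_counter()
    if batched:
        # One prepared statement, many rows, one commit: roughly what
        # happens when PutSQL receives batched flow files.
        cur.executemany("INSERT INTO records VALUES (?, ?)", rows)
        conn.commit()
    else:
        # One INSERT and one commit per row: analogous to pushing each
        # single-row flow file through PutSQL individually.
        for r in rows:
            cur.execute("INSERT INTO records VALUES (?, ?)", r)
            conn.commit()
    elapsed = time.perf_counter() - start
    cur.execute("SELECT COUNT(*) FROM records")
    return cur.fetchone()[0], elapsed

rows = [(i, f"row-{i}") for i in range(50_000)]
n_batched, t_batched = load(rows, batched=True)
n_single, t_single = load(rows, batched=False)
print(n_batched, n_single)  # both 50000
```

Both paths load the same 50,000 rows, but the per-row commit path is typically far slower, which is why keeping rows batched through the flow (rather than splitting down to one row per flow file) matters for throughput.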