Hi Yichao Yu!
Thank you!
`IOBuffer()` solved it:
pgavals
586973-element Array{Tuple{Int64,Int64,Float64,Float64,Float64,Float64},1}:
(10,2,54.195835,142.929169,0.15,333.63588213260454)
(10,2,54.195835,142.9375,0.15,333.6424133198646)
...
A = IOBuffer()
@time for i in pgavals
print(A, i, ',')
end
2.524894 seconds (17.02 M allocations: 658.817 MB, 3.67% gc time) ### <-- very fast
VALS = takebuf_string(A);
VALS = chop(VALS);
@time insquery = """
INSERT INTO srss_api_production.pga_data
(report_id, version_id, lat, lon, pga_value, ep_dis)
values
$VALS;
""";
0.024216 seconds (7 allocations: 28.561 MB, 4.46% gc time)
@time mysql_execute(con, insquery)
307.948797 seconds (7 allocations: 288 bytes) ### <-- slow connection, it's normal
586973 ### <-- N of inserted rows = N of tuple's elements, so ALL FINE!
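For the archives, the whole flow condensed into a self-contained sketch with two toy tuples (note: `takebuf_string(A)` is the Julia 0.4 name; on later versions it is `String(take!(A))`, which is what this sketch uses):

```julia
# Build one big multi-row INSERT in O(n) by printing tuples into an
# IOBuffer instead of repeatedly concatenating strings (which is O(n^2)).
pgavals = [(10, 2, 54.195835, 142.929169, 0.15, 333.63588213260454),
           (10, 2, 54.195835, 142.9375,   0.15, 333.6424133198646)]

A = IOBuffer()
for i in pgavals
    print(A, i, ',')            # each tuple prints as "(10, 2, ...)"
end

VALS = chop(String(take!(A)))   # takebuf_string(A) on Julia 0.4; chop drops the trailing ','

insquery = """
INSERT INTO srss_api_production.pga_data
(report_id, version_id, lat, lon, pga_value, ep_dis)
values
$VALS;
"""
println(insquery)
```

Then a single `mysql_execute(con, insquery)` sends everything in one round trip.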
For the record: don't forget to increase the `max_allowed_packet` size.
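For anyone who hits that limit: a ~500K-row VALUES string is tens of megabytes, well past MySQL's default 4 MB packet, so the server rejects the INSERT unless the limit is raised. One illustrative (not prescriptive) way to do it, with an arbitrary 64 MB value:

```sql
-- Illustrative only: raise the limit on the running server
-- (needs SUPER privilege; to make it permanent, set
-- max_allowed_packet under [mysqld] in my.cnf instead).
SET GLOBAL max_allowed_packet = 67108864;  -- 64 MB
```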
Andrey
On Tuesday, July 5, 2016 at 2:11:39 PM UTC+11, Yichao Yu wrote:
>
> On Mon, Jul 4, 2016 at 10:13 PM, Andrey Stepnov <[email protected]> wrote:
> > Hi everyone!
> >
> > I tried to construct a single INSERT request.
> >
> > The idea:
> >
> > julia> pgavals
> > # Array{Tuple{Int64,Int64,Float64,Float64,Float64,Float64},1}:
> >
> > VALS = ""
> > for i in pgavals
> > VALS = string(VALS, "$i,")
> > end
> >
> > VALS = chop(VALS) # remove last ',' character
> >
> > insquery = """
> > INSERT INTO srss_api_production.pga_data
> > (report_id, version_id, lat, lon, pga_value, ep_dis)
> > values
> > $VALS;
> > """
> > mysql_execute(con, insquery)
> >
> >
> >
> > It works fine with small arrays and tuples.
> >
> > BUT it is not feasible with ~500K elements in the tuple array due to
> > very slow string concatenation. Please look at this example:
> >
> > julia> pgavals
> > 1617840-element Array{Tuple{Int64,Int64,Float64,Float64,Float64,Float64},1}:
> > (10,2,54.495834,137.504166,0.0,515.0428287161928)
> > (10,2,54.495834,137.512497,0.0,514.6500363528039)
> > (10,2,54.495834,137.520828,0.0,514.2575525299238)
> > ...
> >
> > VALS = ASCIIString("")
> > @time for i in pgavals
> > VALS = VALS * "$i,"
>
> This is O(n^2)
>
> use a `IOBuffer()` and print into it instead.
>
> > end
> >
> > #### it spends more than 1 hour and still has not completed!!!!
> >
> >
> > My platform:
> >
> > MacBookPro11,1 / 2.8 GHz Intel Core i7
> >
> > $ julia
> > _
> > _ _ _(_)_ | A fresh approach to technical computing
> > (_) | (_) (_) | Documentation: http://docs.julialang.org
> > _ _ _| |_ __ _ | Type "?help" for help.
> > | | | | | | |/ _` | |
> > | | |_| | | | (_| | | Version 0.4.5 (2016-03-18 00:58 UTC)
> > _/ |\__'_|_|_|\__'_| |
> > |__/ | x86_64-apple-darwin15.5.0
> >
> >
> > Is there any other way?
> >
> > Andrey
> >
> >
> > On Monday, July 4, 2016 at 9:47:08 AM UTC+11, Jeffrey Sarnoff wrote:
> >>
> >> While I am not a user of MariaDB/MySQL, this may be of help.
> >>
> >> The most common way to speed up bulk record insertion avoids your
> >> current one-record-per-INSERT loop by inserting multiple values [rows]
> >> with each INSERT. In SQL:
> >>
> >> INSERT INTO MyTable
> >> (id, firstname, lastname)
> >> VALUES
> >> (1, "Jane", "Smythe") ,
> >> (2, "John", "Jones") ,
> >> (3, "April", "Marlson") ;
> >>
> >>
> >> Using a loop that increments in larger steps (n at once) and getting
> >> the corresponding n records into a single INSERTion should improve
> >> things. This sort of loop behaves in the following way (if you had 10
> >> or 11 records in this example, the final few would be inserted
> >> individually at the end).
> >>
> >> for i in 1:3:9
> >> println( (i, i+1, i+2) )
> >> end
> >> (1,2,3)
> >> (4,5,6)
> >> (7,8,9)
> >>
> >>
> >> On Sunday, July 3, 2016 at 1:54:03 PM UTC-4, Andrey Stepnov wrote:
> >>>
> >>> Hi Community!
> >>>
> >>> I am trying to INSERT around 580K rows into a MariaDB table from array B:
> >>>
> >>> julia> B
> >>> 586973x4 Array{Float64,2}:
> >>> 54.1958 142.929 0.15 333.636
> >>> …
> >>>
> >>>
> >>> using MySQL
> >>> con = mysql_connect("127.0.0.1", "DB", "PASS", "srss_api_production")
> >>>
> >>> # convert to tuple for big insert
> >>> pgavals = Array{Tuple{Float64,Float64,Float64,Float64}}(0)
> >>> for i in 1:length(B[:,1])
> >>> pgavals = push!(pgavals, (B[i,1], B[i,2], B[i,3], B[i,4]))
> >>> end
> >>>
> >>> julia> pgavals
> >>> 586973-element Array{Tuple{Float64,Float64,Float64,Float64},1}:
> >>> (54.195835,142.929169,0.15,333.63588213260454)
> >>> ...
> >>>
> >>>
> >>> # prepare
> >>> mysql_stmt_prepare(
> >>> con,
> >>> """
> >>> INSERT INTO srss_api_production.pga_data
> >>> (report_id, version_id, lat, lon, pga_value,
> >>> ep_dis)
> >>> values
> >>> (10, 2, ?, ?, ?, ?);
> >>> """
> >>> )
> >>> # insert
> >>> for i in pgavals
> >>> mysql_execute(
> >>> con,
> >>> [MYSQL_TYPE_FLOAT,
> >>> MYSQL_TYPE_FLOAT,
> >>> MYSQL_TYPE_FLOAT,
> >>> MYSQL_TYPE_FLOAT],
> >>> i
> >>> )
> >>> end
> >>>
> >>>
> >>> AND it is extremely slow! I think that is because it generates one
> >>> INSERT request for each row…
> >>>
> >>> Please help me find a simple (or probably not so simple) way to make
> >>> the INSERT faster.
> >>>
> >>> Andrey
> >>>
> >
>