On Wed, Oct 19, 2011 at 9:22 PM, Jeremy Cole <[email protected]> wrote:
> Henrik,
>
>> Regarding parsing: The way LOAD DATA LOCAL INFILE works currently, it
>> sends the CSV file to server and there it is stored in /tmp/, read,
>> parsed and fed directly into the table. Possibly this is more
>> efficient than if the client would rewrite CSV data into SQL INSERTs,
>> then the server has to parse the SQL anyway, and SQL is quite wordy.
>> LOAD DATA is very efficient, I've managed to get 100k rows inserted by
>> second (small rows). I'm not sure you can easily get that with
>> INSERTs, (but maybe if you combine a lot of rows into one INSERT).
>>
>> It seems to me ideally you could just modify the current code so that
>> it will never read a file from the server, but instead it could
>> consume CSV data coming in from the client without saving it as a
>> temporary file first.
>
> Are you certain about any of this? I haven't looked at the code in a long
> time, but when I refactored the local infile code in the C API so many years
> ago, this is exactly what was happening.

I'm not certain in the sense that I would have read the code, but I did RTFM:
http://dev.mysql.com/doc/refman/5.1/en/load-data.html
"When using LOCAL with LOAD DATA, a copy of the file is created in the
server's temporary directory. "

To prove me wrong and you right, I guess someone has to look at the
source now...

> No temporarily file was created on
> the server at all. This is one of the biggest reasons to use it -- it
> doesn't require any extra space, permissions, or anything, on the server
> side. It's just a compact (if painful to escape) way to bulk-load data.

If you're right, that would be pretty much the ideal solution. Based
on the security argument and general Drizzle ethos, disabling the
ability to read any file residing on the server could then be
disabled, but the code would then be left to perform this task.

Regarding Shrews and others' comments: Good use cases you mention,
such as select into outfile doing joins. Bet drizzledump doesn't do
that! It seems to me the mysql client still comes close to doing this.
There is no --tab option but we all know there's awesome xml support
that was so fashionable 5 years ago:

$ mysql -uroot --xml -e "select * from test.testtable limit 10;"
<?xml version="1.0"?>

<resultset statement="select * from test.testtable limit 10"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";>
  <row>
        <field name="id">570633</field>
        <field name="sometext">test2</field>
        <field name="longtext">zzezkequox</field>
  </row>

  <row>
        <field name="id">570634</field>
        <field name="sometext">test2</field>
        <field name="longtext">gdtyxvupgm</field>
  </row>
[...cutting here to save your eyes...]

In fact, you can get also tab delimited output, but the switch
"--silent" could perhaps be renamed/given a synonym that has the words
"tab" or "csv" in it:

$ mysql -uroot --silent -e "select * from test.testtable limit 10;"
id      sometext        longtext
570633  test2   zzezkequox
570634  test2   gdtyxvupgm
570643  test2   hprodnyrbt
570644  test2   bjxlecfmzs
570645  test2   unmhwiecyn
570638  test2   tadkubbowy
570646  test2   hamxockdsr
570647  test2   tngtjozrxi
570648  test2   amadubthlc
570650  test2   ctzvhnxkji

Another good point was that select into outfile allows you to define
the delimiters and whether to use quotes or not. So should this be
supported in a client-side replacement, or is the standard unix way
good enough?

$ mysql -uroot --silent -e "select * from test.testtable limit 10;"|
sed -e "s/\t/;/g"
570633;test2;zzezkequox
570634;test2;gdtyxvupgm
570643;test2;hprodnyrbt
570644;test2;bjxlecfmzs
570645;test2;unmhwiecyn
570638;test2;tadkubbowy
570646;test2;hamxockdsr
570647;test2;tngtjozrxi
570648;test2;amadubthlc
570650;test2;ctzvhnxkji

Personally I'm perhaps willing to accept piping through sed as a valid
solution. But beyond that, I also reject the argument "you can use
your favorite scripting language". This is a useful feature and it is
good to continue to support it.

Excellent thread btw, thanks to everyone for chiming in.

henrik
-- 
[email protected]
+358-40-8211286 skype: henrik.ingo irc: hingo
www.openlife.cc

My LinkedIn profile: http://www.linkedin.com/profile/view?id=9522559

_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp

Reply via email to