Re: [sqlite] about the apparently arriving soon "threads"

2014-04-17 Thread big stone
Hi James,

You're right: my example is indeed "4 independent processes" rather than
"4 threads in the same process".

The job I need to do is unchanged: transform a big input table into a big
output table.

I hope that SQLite improvements will allow us to approach this "2x" (or
more) boost in the future, without the pain of managing parallelisation
outside of SQL.

Regards,
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] about the apparently arriving soon "threads"

2014-04-09 Thread James K. Lowden
On Wed, 9 Apr 2014 19:07:27 +0200
big stone  wrote:

> Threading Plumbery  is managed via DOS ".bat commands, as below :
> - a "main.bat" dos command :
>  . pre-clears the 4 "ok finished" files,
>  . launch the 4 threads,
>  . then check every 2 seconds that all "ok finished" files are
> generated.
> - a "test_sqlite_this.bat" command launcher was necessary to pass
> parameters for each sqlite session,
...
> rem each thread is writing a "ok" file after it outputed its 1/4th of
> big result
...
> start cmd /C  "%~dp0test_sqlite_this.bat" %sqlite%
> test_sqlite3_script_v30_0.txt  zout_sqlite_v30_0ok.txt debug
> start cmd /C  "%~dp0test_sqlite_this.bat" %sqlite%
> test_sqlite3_script_v30_1.txt  zout_sqlite_v30_1ok.txt nodebug
...

I think it bears mentioning that each "start cmd /c" starts a separate
*process*, with its own address space and its own file descriptors.

Calling them threads is technically accurate -- every process is 1 or
more threads -- but misleading because it implies issues that don't
exist between distinct processes.  
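
The distinction can be illustrated with a minimal Python sketch (not part of the original scripts; the names `shared`, `bump_in_thread` and `child_pid` are made up for illustration). A thread changes memory that the launching code can see, while a separately started process lives under its own pid and its own address space.

```python
import os
import subprocess
import sys
import threading

# State that lives in this process's address space.
shared = {"hits": 0}


def bump():
    """Increment the counter; only code in the same process can see it."""
    shared["hits"] += 1


def bump_in_thread():
    """Run bump() in a second thread of this process and return the counter."""
    worker = threading.Thread(target=bump)
    worker.start()
    worker.join()
    return shared["hits"]


def child_pid():
    """Start a separate interpreter process and return the pid it reports."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getpid())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    # The thread modified memory that we can read directly.
    print("counter after thread:", bump_in_thread())
    # The child process has its own pid and never sees `shared`.
    print("parent pid:", os.getpid(), "child pid:", child_pid())
```

The child process can only communicate through explicit channels such as files, pipes or exit codes, which is why the batch scripts in this thread signal completion with "ok" files.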

--jkl


Re: [sqlite] about the apparently arriving soon "threads"

2014-04-09 Thread big stone
Hi Simon,

About my test:
- the main input fact file is 220 000 lines of 5 fields (7 389 KB in UTF-8
on a Windows PC),
- the other files are 65 KB,
- initial and final data are on a 7200 rpm rotating disk,
- the sqlite databases, one per thread, are in ":memory:".

The threading plumbing is managed via DOS ".bat" commands, as below:
- a "main.bat" DOS command:
 . pre-clears the 4 "ok finished" files,
 . launches the 4 threads,
 . then checks every 2 seconds that all "ok finished" files have been generated.
- a "test_sqlite_this.bat" command launcher, which was necessary to pass
parameters to each sqlite session.



**main.bat** file
rem let's try parallel

cd %~dp0
set sqlite=%~dp0sqlite3.exe
echo %time%
@echo off

rem each thread writes an "ok" file after it has output its 1/4th of the big result

del /q "%~dp0zout_sqlite_v30_0ok.txt"
del /q "%~dp0zout_sqlite_v30_1ok.txt"
del /q "%~dp0zout_sqlite_v30_2ok.txt"
del /q "%~dp0zout_sqlite_v30_3ok.txt"
start cmd /C  "%~dp0test_sqlite_this.bat" %sqlite% test_sqlite3_script_v30_0.txt  zout_sqlite_v30_0ok.txt debug
start cmd /C  "%~dp0test_sqlite_this.bat" %sqlite% test_sqlite3_script_v30_1.txt  zout_sqlite_v30_1ok.txt nodebug
start cmd /C  "%~dp0test_sqlite_this.bat" %sqlite% test_sqlite3_script_v30_2.txt  zout_sqlite_v30_2ok.txt nodebug
start cmd /C  "%~dp0test_sqlite_this.bat" %sqlite% test_sqlite3_script_v30_3.txt  zout_sqlite_v30_3ok.txt nodebug

set zf="%~dp0zout_sqlite_v30_0ok.txt"

:step0
echo waiting %zf%
rem sleep 1
ping -n 2 127.0.0.1>nul

set zf="%~dp0zout_sqlite_v30_0ok.txt"
if not  exist %zf% goto step0

set zf="%~dp0zout_sqlite_v30_1ok.txt"
if not  exist %zf% goto step0

set zf="%~dp0zout_sqlite_v30_2ok.txt"
if not  exist %zf% goto step0

set zf="%~dp0zout_sqlite_v30_3ok.txt"
if not  exist %zf% goto step0

echo %time%
pause

**test_sqlite_this.bat** file
@echo off
rem %1=sqlite exec , %2=input file , %3=termination_file , %4="debug" (or nothing)

cd %~dp0
echo %time%
echo "sqlite=%1"
echo "inputfile=%2"
echo "terminationfile=%3"
@echo on
%1 ":memory:"<%2

echo %time%>%3

if "%4"=="debug" pause
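
For comparison, the same launch-and-wait plumbing can be sketched in Python (an illustration only, not what was used for the timings above). It starts one process per script with the script on standard input, as `%1 ":memory:"<%2` does, and waits on the process handles instead of polling for "ok" files. The function name `run_scripts` is hypothetical.

```python
import subprocess


def run_scripts(argv, script_paths):
    """Start one process per script, feeding the script on stdin, then wait.

    `argv` is the command to run, for example ["sqlite3", ":memory:"].
    Returns the exit codes in the same order as `script_paths`.
    """
    procs = []
    for path in script_paths:
        with open(path, "rb") as script:
            # The child gets its own copy of the file handle when it is
            # started, so closing ours right after Popen() is safe.
            procs.append(subprocess.Popen(argv, stdin=script))
    # All processes are already running; wait() just collects each result.
    return [proc.wait() for proc in procs]
```

Assuming a `sqlite3` executable on the PATH, the four sessions above would be started with `run_scripts(["sqlite3", ":memory:"], ["test_sqlite3_script_v30_0.txt", "test_sqlite3_script_v30_1.txt", "test_sqlite3_script_v30_2.txt", "test_sqlite3_script_v30_3.txt"])`. Waiting on the handles also reports a failed session through its exit code, which an "ok" file cannot do.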


Re: [sqlite] about the apparently arriving soon "threads"

2014-04-08 Thread Keith Medcalf
>On Tue, Apr 8, 2014 at 11:00 PM, big stone  wrote:
>> Hi,
>>
>> I did experiment splitting my workload in 4 threads on my cpu i3-350m
>to
>> see what are the scaling possibilities.
>>
>> Timing :
>> 1 cpu = 28 seconds
>> 2 cpu = 16 seconds
>> 3 cpu = 15 seconds
>> 4 cpu = 14 seconds
>>
>
>If the info at
>http://ark.intel.com/products/43529/Intel-Core-i3-350M-Processor-3M-Cache-2_26-GHz
>is right, you have 2 cores, each having 2 threads. They're logically
>"cores", but physically not so. My tests with any multi-threading
>benchmarking including parallel quicksort showed that a similar i3
>mobile processor rarely benefit after 2 threads, probably cache
>coherence penalty is the cause. Desktop Intel Core i5-2310, for
>example, is a different beast (4 cores/4 threads), 3 threads almost
>always was x3 times faster, 4 threads - with a little drop.
>
>It all still depends on the application. Once I stopped believing a
>2-threaded Atom would show x2 in any of tests I made, when on one
>graphical one it finally made it. But still if number of threads are
>bigger than number of cores then it's probably a legacy of
>HyperThreading hardware Intel started multi-threading with

It greatly depends on the processor and whether the so-called hyper threads are 
real threads or half-assed threads.  Some Intel processors support real SMP 
threads in which there is no difference if your code is dispatched on the "main 
thread" or the "hyper-thread".  Other processors use very fake threads in which 
only a very small percentage of the ALU is available to the "hyper-thread" and 
only the main thread has access to the entire execution unit.  The former is 
good, the latter usually makes things run slower when multiple threads are 
running unless you and/or the application are smart enough to ensure that you 
set the thread affinity so that the thread dispatched to the half-assed thread 
never needs to access the parts of the execution unit that are never available 
to that thread.  If you do not take such care, then you will continually stall 
the decoding pipeline and the RISC microcode execution stream as the processor 
switches threads between the two pipelines.

For traditional (aka useless) hyper-threaded processors, you are usually better 
off to disable hyper-threading in the BIOS and dedicate all the execution unit 
resources to a single thread.  For processors that support SMP hyper-threading 
you generally get excellent multiprogramming ratios until all the pipelines 
and execution units are fully consumed (assuming sufficient L1 and L2 cache 
that is well designed, and good code and data locality).  Often for a decent 
mix of compute and I/O, this means that you can load up almost full compute on 
all threads simultaneously and almost fully overlap all I/O waits with useful 
compute -- just like a real computer.






Re: [sqlite] about the apparently arriving soon "threads"

2014-04-08 Thread Max Vlasov
On Tue, Apr 8, 2014 at 11:00 PM, big stone  wrote:
> Hi,
>
> I did experiment splitting my workload in 4 threads on my cpu i3-350m to
> see what are the scaling possibilities.
>
> Timing :
> 1 cpu = 28 seconds
> 2 cpu = 16 seconds
> 3 cpu = 15 seconds
> 4 cpu = 14 seconds
>

If the info at 
http://ark.intel.com/products/43529/Intel-Core-i3-350M-Processor-3M-Cache-2_26-GHz
is right, you have 2 cores, each having 2 threads. They're logically
"cores", but physically not so. In my multi-threading benchmarks,
including a parallel quicksort, a similar i3 mobile processor rarely
benefited beyond 2 threads; a cache coherence penalty is probably the
cause. The desktop Intel Core i5-2310, for example, is a different
beast (4 cores/4 threads): 3 threads were almost always 3 times
faster, and with 4 threads there was a little drop.

It all still depends on the application. I had stopped believing a
2-threaded Atom would show x2 in any of the tests I made when, on one
graphical test, it finally did. But still, if the number of threads is
bigger than the number of cores, it's probably a legacy of the
HyperThreading hardware Intel started multi-threading with.
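
As a rough illustration of the logical-versus-physical distinction, here is a small Python sketch. `os.cpu_count()` reports logical processors, so the i3-350M would show 4. The helper `suggested_workers` and its default of 2 hardware threads per core are assumptions made for this example, not something measured in this thread.

```python
import os


def suggested_workers(logical_cpus, threads_per_core=2):
    """Estimate physical cores, assuming each core exposes
    `threads_per_core` logical CPUs (an assumption, not a detected value)."""
    return max(1, logical_cpus // threads_per_core)


# os.cpu_count() counts logical processors and may return None.
logical = os.cpu_count() or 1
print("logical CPUs:", logical,
      "-> suggested workers:", suggested_workers(logical))
```

Whether the extra logical processors help remains workload dependent, as noted above, so the estimate is only a starting point for measurement.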

Max


Re: [sqlite] about the apparently arriving soon "threads"

2014-04-08 Thread Simon Slavin

On 8 Apr 2014, at 8:00pm, big stone  wrote:

> I did experiment splitting my workload in 4 threads on my cpu i3-350m to
> see what are the scaling possibilities.
> 
> Timing :
> 1 cpu = 28 seconds
> 2 cpu = 16 seconds
> 3 cpu = 15 seconds
> 4 cpu = 14 seconds
> 
> Analysis :
> - sqlite is such a small foot-print in memory, it is really scaling well
> with the number of cores,
> 
> - hyper-threaded cores are useless for a database workload,
>  (it was the first time I had the opportunity to really use 4 cores, so
> the first time I notice)
> 
> - but the plumbery I personnaly need to manage threading out of sqlite
> makes it not practical outside of a "test tube".

That's very interesting, Stone.  I especially like your concluding sentence.

Can I ask how big your database was and where your database was held ?  Was it 
on a rotating disk, on a solid state disk, or in memory ?

Simon.


Re: [sqlite] about the apparently arriving soon "threads"

2014-04-08 Thread big stone
Hi,

I experimented with splitting my workload into 4 threads on my i3-350M CPU
to see what the scaling possibilities are.

Timing :
1 cpu = 28 seconds
2 cpu = 16 seconds
3 cpu = 15 seconds
4 cpu = 14 seconds

Analysis:
- sqlite has such a small memory footprint that it scales really well
with the number of cores,

- hyper-threaded cores are useless for a database workload
  (it was the first time I had the opportunity to really use 4 cores, so
the first time I noticed),

- but the plumbing I personally need to manage threading outside of sqlite
makes it impractical outside of a "test tube".
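
The scaling implied by these timings can be worked out with a short Python sketch. The numbers are the ones reported above; the helper names `speedup` and `efficiency` are made up for illustration.

```python
# Wall-clock seconds reported above, keyed by number of parallel workers.
timings = {1: 28.0, 2: 16.0, 3: 15.0, 4: 14.0}


def speedup(timings, n):
    """How many times faster n workers are than 1 worker."""
    return timings[1] / timings[n]


def efficiency(timings, n):
    """Speedup divided by the number of workers (1.0 would be ideal scaling)."""
    return speedup(timings, n) / n


for n in sorted(timings):
    print(f"{n} worker(s): speedup x{speedup(timings, n):.2f}, "
          f"efficiency {efficiency(timings, n):.2f}")
```

Two workers give a speedup of 1.75, while three and four give about 1.87 and 2.0. That matches the analysis: nearly all of the gain comes from the second physical core.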


Re: [sqlite] about the apparently arriving soon "threads"

2014-04-05 Thread Richard Hipp
On Sat, Apr 5, 2014 at 9:16 AM, big stone  wrote:

> Hello,
>
> I see the "threads" branch of fossil has a lot of activity and seems close
> to be finalized.
>

No, it still has a long way to go.


>
> Will it be activated by default on the downloadable executable for windows
> ?
>

Probably not.  It might be possible to activate this feature using a
PRAGMA.  Or, it might require a start-time or compile-time setting.  That
is still all very much in flux.


> Will it apply to parallelisable CTE expression ?
>

No.  Multiple cores will only be used by CREATE INDEX and by large ORDER BY
or GROUP BY statements that cannot be satisfied using indices.

Devoting 4 cores to the task allows large ORDER BY statements to go about
25% faster.
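
For readers coming to this thread later: in subsequent SQLite releases (3.8.7 and up, as far as I know) the limit on helper threads is set per connection with `PRAGMA threads = N`. The Python sketch below assumes such a release; on a library built without worker threads the setting has no effect and the query simply runs on one thread. The function name `sorted_values` and the table `t` are made up for illustration.

```python
import random
import sqlite3


def sorted_values(n_rows, worker_threads=4):
    """Sort an unindexed column, allowing SQLite up to `worker_threads` helpers."""
    con = sqlite3.connect(":memory:")
    try:
        # Upper bound on auxiliary threads for this connection; the library
        # may still use fewer, or none at all.
        con.execute(f"PRAGMA threads = {int(worker_threads)}")
        # No index on x, so ORDER BY has to run a real sort.
        con.execute("CREATE TABLE t(x INTEGER)")
        rng = random.Random(0)
        con.executemany(
            "INSERT INTO t(x) VALUES (?)",
            ((rng.randrange(1_000_000),) for _ in range(n_rows)),
        )
        return [row[0] for row in con.execute("SELECT x FROM t ORDER BY x")]
    finally:
        con.close()
```

The result is the same with or without helper threads; only the elapsed time of a large sort can differ.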


>
> Will it be possible from 1 sqlite.exe command line (or 1 python
> sqlite.execute) to launch several SQL in parallel (and separated threads) ?
>

No.


>
> Typical workload case (that would be awesome if it could be 4 times
> quicker) =
>
> 
> - a long treatment reading a big table, which could be split into N
> treatments, each reading 1/Nth of the records of the big table.
> - the N threads add the resulting records to 1 table.
>
> This typical workload could also be defined as a big CTE:
>    with
>      resultset1 as (...),
>      resultset2 as (...),
>      resultset3 as (...),
>      resultset4 as (...)
>    select * from resultset1
>    union all
>    select * from resultset2
>    union all
>    select * from resultset3
>    union all
>    select * from resultset4
>
> Regards,
>



-- 
D. Richard Hipp
d...@sqlite.org


[sqlite] about the apparently arriving soon "threads"

2014-04-05 Thread big stone
Hello,

I see that the "threads" branch in fossil has a lot of activity and seems
close to being finalized.

Will it be activated by default in the downloadable executable for Windows?
Will it apply to parallelisable CTE expressions?

Will it be possible from 1 sqlite.exe command line (or 1 python
sqlite.execute) to launch several SQL statements in parallel (in separate
threads)?

Typical workload case (it would be awesome if it could be 4 times
quicker):

- a long treatment reading a big table, which could be split into N
treatments, each reading 1/Nth of the records of the big table.
- the N threads add the resulting records to 1 table.

This typical workload could also be defined as a big CTE:
   with
     resultset1 as (...),
     resultset2 as (...),
     resultset3 as (...),
     resultset4 as (...)
   select * from resultset1
   union all
   select * from resultset2
   union all
   select * from resultset3
   union all
   select * from resultset4
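
A minimal Python sketch of this workload follows, under stated assumptions. The tables `big_input` and `big_output`, the `val + 1` "treatment" and the split by `id % N` are all made up for illustration. Each worker opens its own connection and reads 1/Nth of the records; the results are then added to 1 table.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def make_input(path, n_rows):
    """Create a hypothetical big_input table with n_rows records."""
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE big_input(id INTEGER PRIMARY KEY, val INTEGER)")
    con.executemany("INSERT INTO big_input(id, val) VALUES (?, ?)",
                    ((i, i * 2) for i in range(n_rows)))
    con.commit()
    con.close()


def treat_slice(path, k, n):
    """Treatment number k of n: reads only the records where id % n == k."""
    con = sqlite3.connect(path)  # one connection per worker
    try:
        return con.execute(
            "SELECT id, val + 1 FROM big_input WHERE id % ? = ?", (n, k)
        ).fetchall()
    finally:
        con.close()


def transform(path, n=4):
    """Run the n treatments concurrently, then add all results to one table."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        slices = list(pool.map(lambda k: treat_slice(path, k, n), range(n)))
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE big_output(id INTEGER PRIMARY KEY, val INTEGER)")
    for rows in slices:  # a single writer, so no lock contention on output
        con.executemany("INSERT INTO big_output(id, val) VALUES (?, ?)", rows)
    con.commit()
    count = con.execute("SELECT count(*) FROM big_output").fetchone()[0]
    con.close()
    return count


def demo(n_rows=1000, n=4):
    """Build a throwaway database file and return the number of output rows."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "work.db")
        make_input(path, n_rows)
        return transform(path, n)
```

Whether this is actually quicker than a single query depends on how much time the treatment spends inside SQLite and on the number of physical cores, so it should be measured rather than assumed.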

Regards,