For batch imports like this, you might prefer to talk straight to the
SQL layer of SQLAlchemy instead of going through the ORM layer, which
is very nice but incurs a heavy overhead. So I suggest you do
something like:

Gene_data.table.insert().execute([a block of a few lines of data])

See
http://www.sqlalchemy.org/docs/sqlconstruction.html#sql_insert
for details...
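
To make the idea of inserting in blocks concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than SQLAlchemy, so it is not the SQLAlchemy call above, only the same pattern: hand the database a block of rows at a time and commit once at the end. The table name gene_data, the three columns, the chunk size and the sample rows are all assumptions made up for illustration.

```python
import sqlite3
from itertools import islice


def chunks(rows, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(rows)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def bulk_insert(conn, rows, chunk_size=1000):
    """Insert (start, stop, sequence) tuples block by block; return the row count."""
    total = 0
    for block in chunks(rows, chunk_size):
        # One executemany call per block instead of one statement per row.
        conn.executemany(
            "INSERT INTO gene_data (start, stop, sequence) VALUES (?, ?, ?)",
            block,
        )
        total += len(block)
    conn.commit()  # a single commit at the end, not one per row
    return total


# Hypothetical table and data, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gene_data ("
    "id INTEGER PRIMARY KEY, start INTEGER, stop INTEGER, sequence TEXT)"
)
rows = ((i, i + 10, "ATGC") for i in range(2500))
inserted = bulk_insert(conn, rows)
```

In a real import the rows would come from parsing the text file, and the chunk size is something to tune rather than a recommended value.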

The only drawback is that you have to fill in the foreign keys
yourself. Anyway, 4 rows per second seems slower than it should be
(I can't really tell, I have never benchmarked it), so if
you don't want to skip the ORM layer for inserts, please post a
trimmed-down version of your script here so that we can see whether we
can get it to go any faster. But you have to know that we'll never get
close to SQL-layer-only speeds.
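
As an illustration of what filling in the foreign keys yourself can look like, here is a rough sketch, again with the built-in sqlite3 module rather than SQLAlchemy. It keeps a small dict from each source name to its primary key, creates the lookup row the first time a name is seen, and then builds the rows for the big table with the key already in place. The table names, the source_id column, the helper source_id() and the sample values are assumptions for the example, not what Elixir actually generates for your classes.

```python
import sqlite3

# Hypothetical schema: one small lookup table and the large data table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source_data (id INTEGER PRIMARY KEY, source TEXT UNIQUE)"
)
conn.execute(
    "CREATE TABLE gene_data ("
    "id INTEGER PRIMARY KEY, start INTEGER, "
    "source_id INTEGER REFERENCES source_data(id))"
)

source_ids = {}  # cache: source name -> primary key in source_data


def source_id(conn, name):
    """Return the key for `name`, inserting the lookup row on first use."""
    if name not in source_ids:
        cur = conn.execute(
            "INSERT INTO source_data (source) VALUES (?)", (name,)
        )
        source_ids[name] = cur.lastrowid
    return source_ids[name]


# Made-up records standing in for parsed lines of the text file.
records = [(100, "source_a"), (250, "source_b"), (400, "source_a")]

# Resolve each foreign key by hand, then insert the whole block at once.
rows = [(start, source_id(conn, name)) for start, name in records]
conn.executemany("INSERT INTO gene_data (start, source_id) VALUES (?, ?)", rows)
conn.commit()
```

Since the lookup tables are small, keeping the whole mapping in memory should be cheap; the cache here assumes the lookup table starts out empty.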

On 7/18/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
> Hello,
>
> I am new to SQL/ORM/SQLAlchemy/Elixir in general, although I've been
> reading about all of them for a while. I've decided on using Elixir
> for a recent application where I am transferring a flat-file DB that I
> have to SQLite, initially. Then I will focus on the data manipulation.
>
> My issue is that writing to this DB has turned out to be extremely
> slow. By my count it is writing on average 4 rows/second
> on a P4 4.2 GHz. The txt file has 400,000 rows, so it is going
> to take too much time.
>
> My question is whether this is to be expected or whether I am doing
> something wrong. Basically I am iterating through the text file, reading
> the data and instantiating the objects that represent the data. At the
> end of each loop I do a flush.
>
> Below are the objects as I've defined them. The Gene_data table is the
> one that will hold most of the data, while the other tables are small
> (around 100 rows at most).
>
> Thanks!
>
>
> class Gene_data(Entity):
>         has_field('start',Integer)
>         has_field('stop',Integer)
>         has_field('num_sources',Integer)
>         has_field('num_overlap',Integer)
>         has_field('strand',Boolean)
>         has_field('palindromic',Boolean)
>         has_field('other_bacteria',Boolean)
>         has_field('sequence',Unicode)
>         belongs_to('ecoli',of_kind='Ecoli_data',required=True)
>         belongs_to('source',of_kind='Source_data',required=True)
>         belongs_to('info',of_kind='Info_data',required=True)
>
> class Ecoli_data(Entity):
>         has_field('ecoli',Unicode,unique=True)
>         has_many('genes',of_kind='Gene_data')
>
> class Source_data(Entity):
>         has_field('source',Unicode,unique=True)
>         has_many('genes',of_kind='Gene_data')
>
> class Info_data(Entity):
>         has_field('info',Unicode,unique=True)
>         has_many('genes',of_kind='Gene_data')
>


-- 
Gaëtan de Menten
http://openhex.org
