> Just in case this wasn't apparent, you certainly *can* use psycopg2's
> bindings when you're in an otherwise SQLAlchemy app. Worst case you can
> retrieve a raw psycopg2 connection using connection.raw_connection and do
> whatever you need. If you truly have some use for LOBs, SQLAlchemy isn't
> preventing you from using it, it's just not providing any niceties around
> it. The fact that these unusual use cases are not ever prohibited by SQLA
> further raises the bar to adding first class support for them.
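For what it's worth, the raw-connection route can be wrapped in a small helper. This is only a sketch: `with_lobject` and `work` are hypothetical names, and it assumes an Engine bound to a postgresql+psycopg2 URL (Engine.raw_connection() is the documented way to get at the pooled DBAPI connection):

```python
def with_lobject(engine, oid, mode, work):
    """Check a raw psycopg2 connection out of a SQLAlchemy engine and run
    `work(lob)` against the large object `oid`.  `work` is a hypothetical
    user callback; `engine` is an Engine for a postgresql+psycopg2 URL."""
    raw = engine.raw_connection()  # pooled DBAPI connection (proxied)
    try:
        lob = raw.lobject(oid, mode)  # psycopg2 large-object handle
        try:
            result = work(lob)
        finally:
            lob.close()
        raw.commit()  # large-object access must happen inside a transaction
        return result
    finally:
        raw.close()  # returns the connection to the pool
```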
>
> Yes, this much I knew would work, it's just clunkier than it could be.
>
>
> PG's BYTEA is NOT arbitrarily large; it has quite real practical limits,
> and I've hit them regularly when storing compressed HDF5 documents in the
> database as part of a bench-testing framework. The theoretical limit is
> 1GB, but the practical limit is far lower (
> http://www.postgresql.org/message-id/cafj8pracfkoinp2uxeizod5krx29n2ofsoldh0w6ej7rxko...@mail.gmail.com
> ; see also http://www.postgresql.org/docs/9.3/static/storage-toast.html ). I'm not
> looking for streaming, just retrieving and setting binary buffers. It'd be nice to
> translate them transparently to in-memory HDF5 Python objects (i.e. h5py).
>
> Python translation is very easy in SQLA; it's just that if you have special
> needs for SQL syntaxes, that's where special behaviors may be needed. So
> far it sounds like the only blocking factor is that bind_sql needs to
> distinguish between INSERT and UPDATE. That's not a terribly tall order,
> though it is inconvenient in that the API would need a
> backwards-compatibility layer.
>
Perhaps you can keep backwards compatibility by introspecting the callbacks
to determine the arity of their arguments.
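Concretely, that arity-based compatibility shim might look like the following. This is only a sketch; `call_bind_processor` and the `context` argument are hypothetical names, and it assumes the callbacks are plain functions or methods:

```python
import inspect

def call_bind_processor(callback, value, context=None):
    """Invoke a user-supplied bind callback, passing the extra `context`
    argument (a hypothetical compile-context object) only if the
    callback's signature accepts a second parameter."""
    try:
        argspec = inspect.getfullargspec(callback)
    except TypeError:
        # builtins without introspectable signatures: assume old-style
        return callback(value)
    n_args = len(argspec.args)
    if inspect.ismethod(callback):
        n_args -= 1  # discount `self` on bound methods
    if n_args >= 2 or argspec.varargs is not None:
        return callback(value, context)  # new-style two-argument callback
    return callback(value)  # legacy one-argument callback
```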
>
>>
>> As the feature involves SQL functions I don't think you'd be dealing only
>> with bind_processor(), the SQL functions themselves would probably be via
>> SQL-level processing, see
>> http://docs.sqlalchemy.org/en/rel_0_9/core/types.html#applying-sql-level-bind-result-processing.
>> It's true these functions aren't given access to the compiler context where
>> you'd be able to discern INSERT from UPDATE, so I'd probably use two
>> variants of datatype, an InsertLOB and UpdateLOB datatype, then I'd apply
>> these to an INSERT or UPDATE statement at runtime probably using a
>> @compiles decorator -
>> http://docs.sqlalchemy.org/en/rel_0_9/core/compiler.html, that is, take
>> the existing LOB type and just swap it out for InsertLOB/UpdateLOB.
>> Adding compiler context to TypeEngine is something we could look into as a
>> feature as well so that bind_expression() has this available somehow and
>> switching around types wouldn't be needed.
>>
>
> Separate datatypes is clearly not a good approach to this.
>
>
> Let me clarify that these separate datatypes would be totally invisible to
> the user. The user would work with a single LOB type. Translation to
> Insert/Update versions would not be explicit and would occur at the point
> at which the insert/update construct is compiled. At the moment this may
> be the best approach short of modifying the library (but then again I can't
> say much about the approach because I have little idea what the SQL we're
> talking about looks like).
>
Ah. I thought it would be done in the declarative table descriptions.
Being relatively new to SQLA, I'm not sure what that would look like.
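For the record, the SQL-level bind processing referenced above might be sketched roughly as follows. This is a guess at the shape, not a working design: `InsertLOB` is the hypothetical INSERT-side type from the discussion, modeled on the UserDefinedType/bind_expression pattern in the linked docs, and lo_creat(-1) is the PostgreSQL server-side allocator:

```python
from sqlalchemy import Table, Column, MetaData, Integer, func
from sqlalchemy.types import UserDefinedType

class InsertLOB(UserDefinedType):
    """Hypothetical INSERT-side type: ignore the bound value and let the
    server allocate a fresh large object via lo_creat(-1)."""
    def get_col_spec(self):
        return "OID"

    def bind_expression(self, bindvalue):
        # SQL-level bind processing: the rendered statement embeds
        # lo_creat(...) in place of the plain bound parameter.
        return func.lo_creat(-1, type_=self)

t = Table('pr_result', MetaData(),
          Column('id', Integer, primary_key=True),
          Column('trace_oid', InsertLOB()))
stmt = t.insert().values(trace_oid=0)
# str(stmt) should render something along the lines of:
#   INSERT INTO pr_result (trace_oid) VALUES (lo_creat(:lo_creat_1))
```

An UpdateLOB variant would presumably emit different SQL from its own bind_expression(), with the compile-time swap between the two done via @compiles as described above.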
>
>
> Everything starts with a transaction block.
>
> the psycopg2 DBAPI is implicitly in a transaction block at all times
> unless "autocommit" mode is set up, so there shouldn't be any problem here.
>
> AFAIK it won't work with autocommit, but I don't have documentation to
prove that; I just remember getting errors pertaining to it, I think in JDBC.
>
> Hand-crafted inserts use the server-side lo_creat(-1) function (in SQL), which
> allocates a new large object, and a RETURNING clause to get the OID back to
> the client side in one go. Then I start using the lobject API on the
> returned OID.
>
> I'd like to see explicit SQL, preferably in the form of a psycopg2 script
> that illustrates all the operations you wish to support and specifically
> how they must interact with the database.
>
>
I've attached example usages. As I indicated in my prior email, right now
I only do inserts/selects.
>
> In patches I've made to cppdb, in the bind of std::istream I call the
> client-side lo_create, lo_open, a sequence of lo_writes, and finally
> lo_close. In retrospect this doesn't handle updates particularly well,
> and maybe this could be used in SQLAlchemy as-is too: an update just
> allocates a new large object and unlinks the old one.
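That update-as-reallocate idea can be sketched with psycopg2's lobject API roughly as below. `replace_lob` is a hypothetical helper name; lo_unlink is the real PostgreSQL server-side function that drops a large object, and passing oid=0 to conn.lobject() asks the server to allocate a new OID:

```python
def replace_lob(conn, old_oid, data, chunksize=10 * 1024 * 1024):
    """Write `data` into a freshly allocated large object and unlink the
    old one, returning the new OID.  `conn` is a psycopg2 connection and
    this must run inside a transaction."""
    lob = conn.lobject(0, 'wb')  # oid=0: server allocates a new OID
    new_oid = lob.oid
    try:
        for start in range(0, len(data), chunksize):
            lob.write(data[start:start + chunksize])
    finally:
        lob.close()
    cur = conn.cursor()
    try:
        # drop the storage for the superseded large object
        cur.execute("select lo_unlink(%s);", (old_oid,))
    finally:
        cur.close()
    return new_oid
```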
>
> I wonder whether the mechanics of these functions couldn't be more cleanly
> wrapped into server-side stored procedures. If they can be distilled
> into simple insert_lob()/update_lob() functions, that would potentially be a
> better separation of duties.
>
>
I think it's because only "small" lobs could rely on that: huge lobs that
exceed working memory fit better with a file-like API. It would be a nice
convenience to have, perhaps. These same "small" lobs (i.e. lobs that fit in
working memory, keeping in mind modern commodity servers easily get 64GB)
are the kind that I would expect SQLA to deal with, although if it bound
file-like objects as well as buffers, that would allow both cases to be
covered. I don't think I'll be writing code that uses those stored
procedures, but they could be nice conveniences. If SQLA managed to support
these lobs well, however, I don't think users would need to use them directly.
>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "sqlalchemy" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/sqlalchemy/4TCg_TVLgHc/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.
>
def chunklengths(N, n):
    """Yield the lengths of successive `n`-byte chunks covering `N` bytes
    (the final chunk may be shorter)."""
    for start in range(0, N, n):
        yield min(n, N - start)

def chunks(s, n):
    """Produce `n`-sized chunks from `s`."""
    offset = 0
    for chunklen in chunklengths(len(s), n):
        yield s[offset:offset + chunklen]
        offset += chunklen

with conn.cursor() as cur:
    cur.execute(
        'select dataset, flight, run, input.icd from input '
        'inner join run_table on run_table.id = input.flight_run_id '
        'where input.id = %s;',
        (input_id,)
    )
    dataset, flight, run, icd = cur.fetchone()
    flight_path = pjoin(dbroot, dataset, 'flights', flight)
    run_path = pjoin(flight_path, 'runs', run)
    trace_path = pjoin(run_path, icd, 'trace.h5')
    settings = file(pjoin(flight_path, 'conf', 'settings.yaml'), 'r').read()
    icd_output = file(pjoin(run_path, icd, 'icd_output.bin'), 'rb').read()
    print 'inserting : sizes: conf: %d icd_output: %d trace: %d' % (
        len(settings), len(icd_output), os.path.getsize(trace_path))
    cur.execute(
        "INSERT INTO pr_result(result_set, configuration, configuration_md5, "
        "input_id, icd_output, trace_oid) VALUES ("
        "%(result_set)s, %(configuration)s, md5(%(configuration)s), "
        "%(input_id)s, %(icd_output)s, lo_creat(-1)) "
        "RETURNING id, trace_oid;",
        {'result_set': opts.result_set, 'configuration': settings,
         'input_id': input_id, 'icd_output': Binary(icd_output)}
    )
    rid, trace_oid = cur.fetchone()
print 'uploading trace'
trace_remote = conn.lobject(trace_oid, 'rw')
trace_str = file(trace_path, 'rb').read()
for chunk in chunks(trace_str, 1024*1024*10):
    trace_remote.write(chunk)
trace_remote.close()
conn.commit()
print 'Inserted result %s as id = %d' % (run_path, rid)
# Pull-to-a-file example; in many cases one will pull to memory instead.
with conn.cursor() as cur:
    cur.execute(
        "select trace_oid from pr_result where id = %s;",
        (opts.id,)
    )
    trace_oid, = cur.fetchone()
trace_remote = conn.lobject(trace_oid, 'r')
trace_remote.seek(0, 2)  # SEEK_END
nbytes = trace_remote.tell()
trace_remote.seek(0, 0)  # SEEK_SET
print 'pulling trace nbytes = %d' % (nbytes,)
with file(trace_path, 'wb') as trace_file:
    for chunksize in chunklengths(nbytes, 1024*1024*10):
        chunk = trace_remote.read(chunksize)
        if not chunk:  # stop early on a short read
            break
        trace_file.write(chunk)
trace_remote.close()
conn.commit()