Re: [Zope-dev] Zope and Storm (SQL)

2008-05-25 Thread Andreas Jung



--On 24. Mai 2008 18:51:17 +0200 Brian Sutherland <[EMAIL PROTECTED]> 
wrote:



On Sat, May 24, 2008 at 06:23:01PM +0200, Andreas Jung wrote:



--On 24. Mai 2008 17:57:17 +0200 Brian Sutherland
<[EMAIL PROTECTED]>  wrote:


On Sat, May 24, 2008 at 11:28:37AM +0200, Andreas Jung wrote:

- how do you deal with hierarchical or semi-structured data
 (e.g. SGML, XML)?


As a side note, PostgreSQL 8.3 has a native XML column type. It's fairly
basic still, but you can create indexes based on xpath queries.


Yup, Postgres already had XML support as an add-on in version 7...but it was
pretty much unusable for production


Interesting, why not?

I'm finding it incredibly useful, but then maybe my needs are very
simple.


An important piece of functionality would be the ability to modify nodes of an
XML document in place. Storing and querying with XPath is only one part of the
story.


Andreas





Re: [Zope-dev] Zope and Storm (SQL)

2008-05-25 Thread David Pratt
Hi Jurgen. Thank you for keeping this thread going. It is helpful, and it is
worth experimenting with this further for light applications. I have not given
much attention to storage with an ORM since the early days of SQLAlchemy. There
have been many improvements to the various implementations since then. I
believe Storm is a bit more focused about what it wants to be, and by all
accounts that translates into a bit more speed.


Jürgen kartnaller wrote:
On Sat, May 24, 2008 at 2:24 PM, David Pratt <[EMAIL PROTECTED]> wrote:


Hi Jurgen. Thank you for this informative post. I am particularly
interested in how this fits into the refactoring of legacy code. I
appreciate your sharing your experiences. I also think everyone has
been paying much attention to the insight Lovely has been sharing on
scaling and efficient delivery of Zope over the past couple of years.

As a means of mapping one or more backends without changing the
logic or mixing backend logic into the code, schemas play an important
role. I can see the benefit of providing plain SQL statements since they
are clearly understood. The concern I have about not using schemas is
the loss of integration potential for the different backends using a
common pattern of mapping zope schema to xml, rdf, rdb, whatever ...
In your opinion, is this abstraction simply costing too much,
unnecessary, or a matter of application development and runtime speed?




Schemas provide the potential of mapping the object structure to other
storage backends with different notions of persistence - so it is not a direct
reference from xml to sql. As much as possible I do not want to change an
app in order to use a different backend. With adaptation and classes that
behave the same way in the CA, it ought to be possible to change only the
import statements to point at different implementations of Contained, Location,
etc., and in general to change what you are subclassing from in order to
persist data.


I'm not sure what you mean by schema/xml in the context of SQL. You can
still use your schemas as you already do. I just wrote a CSV importer
based on schemas together with formlib's EditForm and AddForm.
The only difference is that it is not possible to use
"FieldProperty" in Storm classes.




For me, the crux of the rdb approach for legacy code is the
container, location and traversal. You have been very generous with
your examples. I am really hoping for a clearer idea of how to handle
Container, OrderedContainer, and Location, which are prevalent in
legacy code. Overall, I can say that I quite like the innovation here in
getting to a 'leaner' concept of Zope.


If you have your legacy code, you have a clear definition of what you need
from your container. So it should be straightforward to implement
IContainer and ILocated.


Yes, you are right. I'll experiment with what you have provided and see 
if I can get something basic working with the z3c.traverser package. 
BTW, I am assuming at this point that everything is registered in the 
global site manager for the app. I am not sure if it is possible to have 
a notion of local sites without the ZODB. This of course would require
changes in legacy applications as well.



Without going too deep into this here is some code which should be usable:

from storm.locals import Storm, Int, Unicode, Reference, ReferenceSet
from zope import interface
from zope.app.container.interfaces import IContainer


class Container(Storm):
    interface.implements(IContainer)
    __storm_table__ = 'containers'
    id = Int(primary=True)
    content = ReferenceSet(id, 'Content.parentId')

    def __iter__(self):
        return iter(self.content)

    def __getitem__(self, name):
        item = self.content.find(Content.name == name).one()
        if item is None:
            raise KeyError(name)
        return item

    def __setitem__(self, name, item):
        item.name = name  # add namechooser things here
        item.parent = self

    def __len__(self):
        return self.content.count()


class Content(Storm):
    __storm_table__ = 'contents'
    id = Int(primary=True)
    parentId = Int()
    name = Unicode()
    parent = Reference(parentId, Container.id)


No worries about killing anyone :-) I appreciate this sketch. I believe
Martijn and Brian were discussing a generic implementation of Container
for the different alchemy flavors a few weeks back - an implementation
that could live outside of those packages. It would be useful to have a
decent generic Container implementation for rdb backends. Many thanks.


Regards,
David


Don't kill me if something is wrong here; this is an untested quick hack
to demonstrate what's possible. Also, the IContainer interface is not
fully implemented.




Regards,
David



Jürgen kartnaller wrote:

There seems to be some interest in the use of SQL databases with
Zope.

Lovelysystems is now using SQL databases as the primary storage
for their applications. We use Zope and Postgres with Storm as the ORM.
The main reason for switching to an SQL database was speed issues
with queries.

Here is a short sum

Re: [Zope-dev] Zope and Storm (SQL)

2008-05-24 Thread Jürgen kartnaller
On Sat, May 24, 2008 at 2:24 PM, David Pratt <[EMAIL PROTECTED]> wrote:

> Hi Jurgen. Thank you for this informative post. I am particularly
> interested in how this fits into the refactoring of legacy code. I
> appreciate your sharing your experiences. I also think everyone has been
> paying much attention to the insight Lovely has been sharing on scaling and
> efficient delivery of Zope over the past couple of years.
>
> As a means of mapping one or more backends without changing the logic or
> mixing backend logic into the code, schemas play an important role. I can see
> the benefit of providing plain SQL statements since they are clearly understood.
> The concern I have about not using schemas is the loss of integration
> potential for the different backends using a common pattern of mapping zope
> schema to xml, rdf, rdb, whatever ... In your opinion, is this abstraction
> simply costing too much, unnecessary, or a matter of application development
> and runtime speed?


I'm not sure what you mean by schema/xml in the context of SQL. You can
still use your schemas as you already do. I just wrote a CSV importer based
on schemas together with formlib's EditForm and AddForm.
The only difference is that it is not possible to use
"FieldProperty" in Storm classes.


>
> For me, the crux of the rdb approach for legacy code is the container,
> location and traversal. You have been very generous with your examples. I am
> really hoping for a clearer idea of how to handle Container, OrderedContainer,
> and Location, which are prevalent in legacy code. Overall, I can say that I
> quite like the innovation here in getting to a 'leaner' concept of Zope.


If you have your legacy code, you have a clear definition of what you need from
your container. So it should be straightforward to implement IContainer and
ILocated.

Without going too deep into this here is some code which should be usable:

from storm.locals import Storm, Int, Unicode, Reference, ReferenceSet
from zope import interface
from zope.app.container.interfaces import IContainer


class Container(Storm):
    interface.implements(IContainer)
    __storm_table__ = 'containers'
    id = Int(primary=True)
    content = ReferenceSet(id, 'Content.parentId')

    def __iter__(self):
        return iter(self.content)

    def __getitem__(self, name):
        item = self.content.find(Content.name == name).one()
        if item is None:
            raise KeyError(name)
        return item

    def __setitem__(self, name, item):
        item.name = name  # add namechooser things here
        item.parent = self

    def __len__(self):
        return self.content.count()


class Content(Storm):
    __storm_table__ = 'contents'
    id = Int(primary=True)
    parentId = Int()
    name = Unicode()
    parent = Reference(parentId, Container.id)

Don't kill me if something is wrong here; this is an untested quick hack to
demonstrate what's possible. Also, the IContainer interface is not fully
implemented.
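
For completeness, the remaining read methods could be filled in roughly like
this (equally untested, just a sketch written as a mixin that assumes the
Container/Content classes above):

class ContainerReadMixin(object):
    # hypothetical helper supplying the read methods missing from the
    # Container sketch above; assumes self.content is the ReferenceSet

    def keys(self):
        return [item.name for item in self.content]

    def values(self):
        return list(self.content)

    def items(self):
        return [(item.name, item) for item in self.content]

    def get(self, name, default=None):
        item = self.content.find(Content.name == name).one()
        return default if item is None else item

    def __contains__(self, name):
        return self.content.find(Content.name == name).count() != 0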


>
> Regards,
> David
>
>
>
> Jürgen kartnaller wrote:
>
>> There seems to be some interest in the use of SQL databases with Zope.
>>
>> Lovelysystems is now using SQL databases as the primary storage for their
>> applications. We use Zope and Postgres with Storm as the ORM.
>> The main reason for switching to an SQL database was speed issues with
>> queries.
>>
>> Here is a short summary of my thoughts and experiences while using Storm
>> and Zope for about 3 months now.
>>
>>
>> RelStorage:
>> RelStorage doesn't solve the speed problems. Doing queries with SQL is
>> much faster than doing it with the ZODB. If you work with a lot of large
>> BTrees, you need to load them all into the memory of each Zope client. This
>> has to be done with RelStorage too.
>>
>>
>> Indexes:
>> You don't need to implement catalog indexes; this is all done on the
>> database side. When implementing and using your content types, you don't
>> need to think about indexes at first; later you optimize the database without
>> touching your python code.
>>
>> A speed example :
>> We had to find similar users based on items a user has collected. Doing
>> this with the ZODB took minutes to calculate for users with a lot of items. We
>> had to implement a lot of code to do the calculation asynchronously so as not
>> to block the user's request.
>> Doing the same with SQL was possible with a single (of course complex)
>> query within 300ms, no async machinery needed; just implement the query,
>> optimize the indexes on the server, finished! RelStorage will not help you
>> here.
>>
>>
>> Content implementation:
>> While porting our existing ZODB-based packages to SQL, we found
>> that implementing them with Storm is as easy as using the ZODB. We can still
>> use the full power of Zope's component architecture. This is because Storm
>> objects are extremely easy to implement. You can implement a Storm object
>> like a Persistent object: just derive from Storm instead of Persistent, add
>> __storm_table__ and define the properties as Storm properties.
>>
>> For me, a big mistake when switching from the ZODB to SQL is trying to use
>> the container pattern at any cost.
>> A container is nothing but a 1:N relation, and that is exactly what an SQL
>> database provides :

Re: [Zope-dev] Zope and Storm (SQL)

2008-05-24 Thread Brian Sutherland
On Sat, May 24, 2008 at 06:23:01PM +0200, Andreas Jung wrote:
>
>
> --On 24. Mai 2008 17:57:17 +0200 Brian Sutherland <[EMAIL PROTECTED]> 
> wrote:
>
>> On Sat, May 24, 2008 at 11:28:37AM +0200, Andreas Jung wrote:
>>> - how do you deal with hierarchical or semi-structured data
>>>  (e.g. SGML, XML)?
>>
>> As a side note, PostgreSQL 8.3 has a native XML column type. It's fairly
>> basic still, but you can create indexes based on xpath queries.
>
> Yup, Postgres already had XML support as an add-on in version 7...but it was
> pretty much unusable for production

Interesting, why not?

I'm finding it incredibly useful, but then maybe my needs are very
simple.

> and the functionality of the XML
> type in Postgres 8 looks similar (feature-wise).
>
> Andreas



-- 
Brian Sutherland


Re: [Zope-dev] Zope and Storm (SQL)

2008-05-24 Thread Andreas Jung



--On 24. Mai 2008 17:57:17 +0200 Brian Sutherland <[EMAIL PROTECTED]> 
wrote:



On Sat, May 24, 2008 at 11:28:37AM +0200, Andreas Jung wrote:

- how do you deal with hierarchical or semi-structured data
 (e.g. SGML, XML)?


As a side note, PostgreSQL 8.3 has a native XML column type. It's fairly
basic still, but you can create indexes based on xpath queries.


Yup, Postgres already had XML support as an add-on in version 7...but it was
pretty much unusable for production, and the functionality of the XML type in
Postgres 8 looks similar (feature-wise).


Andreas



Re: [Zope-dev] Zope and Storm (SQL)

2008-05-24 Thread Brian Sutherland
On Sat, May 24, 2008 at 11:28:37AM +0200, Andreas Jung wrote:
> - how do you deal with hierarchical or semi-structured data
>  (e.g. SGML, XML)?

As a side note, PostgreSQL 8.3 has a native XML column type. It's fairly
basic still, but you can create indexes based on xpath queries.

I think future versions will be able to use a schema as a constraint for
a column.
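
To give a concrete picture of what such an xpath-based index looks like, here
is a small, hypothetical sketch using psycopg2; the connection parameters,
table and element names are made up:

import psycopg2

conn = psycopg2.connect("dbname=test")
cur = conn.cursor()

# a table with a native xml column (PostgreSQL 8.3+)
cur.execute("CREATE TABLE docs (id serial PRIMARY KEY, body xml)")

# an expression index over an xpath() result; xpath() returns an xml
# array, so the result is cast to text[] to make it indexable
cur.execute("""
    CREATE INDEX docs_title_idx ON docs
        (cast(xpath('/doc/title/text()', body) AS text[]))
""")
conn.commit()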

-- 
Brian Sutherland


Re: [Zope-dev] Zope and Storm (SQL)

2008-05-24 Thread David Pratt
Hi Jurgen. Thank you for this informative post. I am particularly 
interested in how this fits into the refactoring of legacy code. I 
appreciate your sharing your experiences. I also think everyone has been 
paying much attention to the insight Lovely has been sharing on scaling 
and efficient delivery of Zope over the past couple of years.


As a means of mapping one or more backends without changing the logic or
mixing backend logic into the code, schemas play an important role. I can see
the benefit of providing plain SQL statements since they are clearly
understood. The concern I have about not using schemas is the loss of
integration potential for the different backends using a common pattern
of mapping zope schema to xml, rdf, rdb, whatever ... In your opinion,
is this abstraction simply costing too much, unnecessary, or a matter of
application development and runtime speed?


For me, the crux of the rdb approach for legacy code is the container,
location and traversal. You have been very generous with your examples.
I am really hoping for a clearer idea of how to handle Container,
OrderedContainer, and Location, which are prevalent in legacy code.
Overall, I can say that I quite like the innovation here in getting to a
'leaner' concept of Zope.


Regards,
David



Jürgen kartnaller wrote:

There seems to be some interest in the use of SQL databases with Zope.

Lovelysystems is now using SQL databases as the primary storage for
their applications. We use Zope and Postgres with Storm as the ORM.
The main reason for switching to an SQL database was speed issues with
queries.


Here is a short summary of my thoughts and experiences while using Storm
and Zope for about 3 months now.



RelStorage:
RelStorage doesn't solve the speed problems. Doing queries with SQL is
much faster than doing it with the ZODB. If you work with a lot of large
BTrees, you need to load them all into the memory of each Zope
client. This has to be done with RelStorage too.



Indexes:
You don't need to implement catalog indexes; this is all done on the
database side. When implementing and using your content types, you don't
need to think about indexes at first; later you optimize the database
without touching your python code.


A speed example :
We had to find similar users based on items a user has collected. Doing
this with the ZODB took minutes to calculate for users with a lot of items.
We had to implement a lot of code to do the calculation asynchronously
so as not to block the user's request.
Doing the same with SQL was possible with a single (of course complex)
query within 300ms, no async machinery needed; just implement the query,
optimize the indexes on the server, finished! RelStorage will not help
you here.



Content implementation:
While porting our existing ZODB-based packages to SQL, we found
that implementing them with Storm is as easy as using the ZODB. We can
still use the full power of Zope's component architecture. This is because
Storm objects are extremely easy to implement. You can implement a Storm
object like a Persistent object: just derive from Storm instead of
Persistent, add __storm_table__ and define the properties as Storm
properties.


For me, a big mistake when switching from the ZODB to SQL is trying to use
the container pattern at any cost.
A container is nothing but a 1:N relation, and that is exactly what an
SQL database provides : relations


class Content(Storm):
    __storm_table__ = 'contents'
    id = Int(primary=True)
    content = ReferenceSet(id, 'Contained.somethingId')

c = Content()

Now you can
 - add data : c.content.add(content)
 - iterate : for a in c.content:
 - search : c.content.find(...)
 - sort : c.content.find().order_by(...)
 - do anything a Storm ResultSet provides

But of course it is possible to put an adapter around the Content class 
which will provide IContainer.



Annotation:
Annotations are 1:1 relations, so it's as easy as the above.
We use annotations like simple adapters to other tables.

class ToBeAnnotated(Storm):
    interface.implements(ICanHaveData)
    __storm_table__ = 'annotatable'
    id = Int(primary=True)

Note that the "annotated" storm table is implemented as an adapter :

class Data(Storm):
    interface.implements(IData)
    component.adapts(ICanHaveData)
    __storm_table__ = 'data'
    id = Int(primary=True)
    __parent__ = Reference(id, ToBeAnnotated.id)

    def __init__(self, context):
        # a dummy to make the adapter happy
        pass

We can now register "Data" as an adapter.
We use a special adapter factory, similar to zope.annotation.factory, to
auto-create adapted content.


def contentAdapter(table, autocreate=True):
    # an adapter on content for content contained in other tables. Just like
    # the annotation adapter, an instance is created if autocreate is True.
    adapts = component.adaptedBy(table)
    if adapts is None:
        raise TypeError("Missing 'zope.component.adapts' on table")

    @component.adapter(list(adapts)[0])
    @interface.implementer(list(interface.implementedBy(table))[0])
    def getAdapter(context):

Re: [Zope-dev] Zope and Storm (SQL)

2008-05-24 Thread Jürgen kartnaller
On Sat, May 24, 2008 at 11:28 AM, Andreas Jung <[EMAIL PROTECTED]> wrote:

>
>
> --On 24. Mai 2008 10:47:54 +0200 Jürgen kartnaller <
> [EMAIL PROTECTED]> wrote:
>
>  There seems to be some interest in the use of SQL databases with Zope.
>>
>> Lovelysystems is now using SQL databases as the primary storage for their
>> applications. We use Zope and Postgres with Storm as the ORM.
>> The main reason for switching to an SQL database was speed issues with
>> queries.
>>
>>
> Thanks for this interesting mail.
>
> Some questions:
>
> - how do you deal with binary large objects?


z3c.extfile


>
>
> - how do you deal with changes in the data model and
>  related migrations?


Our SQL scripts contain ALTER TABLE statements to update existing tables.
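
For illustration, such an update step can be as small as this; a minimal
sketch, where getStore() is the store accessor used elsewhere in this thread
and the table/column names are made up:

# hypothetical upgrade step: extend an existing table and backfill a default
store = getStore()
store.execute("ALTER TABLE contents ADD COLUMN description TEXT")
store.execute("UPDATE contents SET description = '' WHERE description IS NULL")
store.commit()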


>
>
> - how do you deal with hierarchical or semi-structured data
>  (e.g. SGML, XML)?


We have no use case for this right now.


Jürgen


Re: [Zope-dev] Zope and Storm (SQL)

2008-05-24 Thread Andreas Jung



--On 24. Mai 2008 10:47:54 +0200 Jürgen kartnaller 
<[EMAIL PROTECTED]> wrote:



There seems to be some interest in the use of SQL databases with Zope.

Lovelysystems is now using SQL databases as the primary storage for their
applications. We use Zope and Postgres with Storm as the ORM.
The main reason for switching to an SQL database was speed issues with
queries.



Thanks for this interesting mail.

Some questions:

- how do you deal with binary large objects?

- how do you deal with changes in the data model and
 related migrations?

- how do you deal with hierarchical or semi-structured data
 (e.g. SGML, XML)?

We have been working on and building hybrid solutions on top of Zope 2 for
years, where part of the data is stored within the ZODB and the metadata is
stored within Postgres, and this has worked out pretty well. I would disagree
that an RDBMS is the best solution for every particular use case. The decision
about the backend storage must be made for every project - either use the ZODB,
an RDBMS, or a mixture of both.


Andreas






[Zope-dev] Zope and Storm (SQL)

2008-05-24 Thread Jürgen kartnaller
There seems to be some interest in the use of SQL databases with Zope.

Lovelysystems is now using SQL databases as the primary storage for their
applications. We use Zope and Postgres with Storm as the ORM.
The main reason for switching to an SQL database was speed issues with
queries.

Here is a short summary of my thoughts and experiences while using Storm and
Zope for about 3 months now.


RelStorage:
RelStorage doesn't solve the speed problems. Doing queries with SQL is much
faster than doing it with the ZODB. If you work with a lot of large BTrees,
you need to load them all into the memory of each Zope client. This has to
be done with RelStorage too.


Indexes:
You don't need to implement catalog indexes; this is all done on the
database side. When implementing and using your content types, you don't
need to think about indexes at first; later you optimize the database without
touching your python code.

A speed example :
We had to find similar users based on items a user has collected. Doing this
with the ZODB took minutes to calculate for users with a lot of items. We had
to implement a lot of code to do the calculation asynchronously so as not to
block the user's request.
Doing the same with SQL was possible with a single (of course complex) query
within 300ms, no async machinery needed; just implement the query, optimize
the indexes on the server, finished! RelStorage will not help you here.
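
To give an idea of the shape of such a query (this is not the actual query,
just a sketch; the UserItem class, its columns and similarUsers are made up),
the whole ranking can be expressed with Storm and computed by the database in
one round trip:

from storm.expr import Count, Desc
from storm.locals import Storm, Int


class UserItem(Storm):
    # hypothetical link table: which user has collected which item
    __storm_table__ = 'useritems'
    id = Int(primary=True)
    userId = Int()
    itemId = Int()


def similarUsers(store, userId, limit=10):
    # items collected by the given user
    myItems = list(store.find(UserItem.itemId, UserItem.userId == userId))
    # other users who collected any of those items, ranked by overlap
    result = store.find(
        (UserItem.userId, Count(UserItem.itemId)),
        UserItem.itemId.is_in(myItems),
        UserItem.userId != userId)
    result.group_by(UserItem.userId)
    result.order_by(Desc(Count(UserItem.itemId)))
    return list(result[:limit])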


Content implementation:
While porting our existing ZODB-based packages to SQL, we found that
implementing them with Storm is as easy as using the ZODB. We can still use the
full power of Zope's component architecture. This is because Storm objects
are extremely easy to implement. You can implement a Storm object like a
Persistent object: just derive from Storm instead of Persistent, add
__storm_table__ and define the properties as Storm properties.
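
As a trivial illustration of that recipe (the INote interface and the
class/table names are invented for this sketch, not taken from one of our
packages):

from storm.locals import Storm, Int, Unicode
from zope import interface, schema


class INote(interface.Interface):
    title = schema.TextLine(title=u'Title')
    text = schema.Text(title=u'Text')


class Note(Storm):
    # the same shape as a Persistent content class, except that it
    # derives from Storm, names its table and uses Storm properties
    interface.implements(INote)
    __storm_table__ = 'notes'
    id = Int(primary=True)
    title = Unicode()
    text = Unicode()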

For me, a big mistake when switching from the ZODB to SQL is trying to use the
container pattern at any cost.
A container is nothing but a 1:N relation, and that is exactly what an SQL
database provides : relations

class Content(Storm):
    __storm_table__ = 'contents'
    id = Int(primary=True)
    content = ReferenceSet(id, 'Contained.somethingId')

c = Content()

Now you can
 - add data : c.content.add(content)
 - iterate : for a in c.content:
 - search : c.content.find(...)
 - sort : c.content.find().order_by(...)
 - do anything a Storm ResultSet provides

But of course it is possible to put an adapter around the Content class
which will provide IContainer.
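
To see those operations end to end, here is a minimal, self-contained session
against an in-memory SQLite store (the Contained class, the table DDL and the
sample data are assumptions made to match the sketch above):

from storm.locals import create_database, Store, Storm, Int, Unicode, \
    Reference, ReferenceSet


class Content(Storm):
    __storm_table__ = 'contents'
    id = Int(primary=True)
    content = ReferenceSet(id, 'Contained.somethingId')


class Contained(Storm):
    __storm_table__ = 'contained'
    id = Int(primary=True)
    somethingId = Int()
    title = Unicode()
    something = Reference(somethingId, Content.id)


store = Store(create_database('sqlite:'))
store.execute('CREATE TABLE contents (id INTEGER PRIMARY KEY)')
store.execute('CREATE TABLE contained '
              '(id INTEGER PRIMARY KEY, somethingId INTEGER, title VARCHAR)')

c = Content()
store.add(c)
item = Contained()
item.title = u'first'
c.content.add(item)                                        # add data
store.flush()

print [a.title for a in c.content]                         # iterate
print c.content.find(Contained.title == u'first').one()    # search
print c.content.find().order_by(Contained.title).first()   # sort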


Annotation:
Annotations are 1:1 relations, so it's as easy as the above.
We use annotations like simple adapters to other tables.

class ToBeAnnotated(Storm):
    interface.implements(ICanHaveData)
    __storm_table__ = 'annotatable'
    id = Int(primary=True)

Note that the "annotated" storm table is implemented as an adapter :

class Data(Storm):
    interface.implements(IData)
    component.adapts(ICanHaveData)
    __storm_table__ = 'data'
    id = Int(primary=True)
    __parent__ = Reference(id, ToBeAnnotated.id)

    def __init__(self, context):
        # a dummy to make the adapter happy
        pass

We can now register "Data" as an adapter.
We use a special adapter factory, similar to zope.annotation.factory, to
auto-create adapted content.

def contentAdapter(table, autocreate=True):
    # an adapter on content for content contained in other tables. Just like
    # the annotation adapter, an instance is created if autocreate is True.
    adapts = component.adaptedBy(table)
    if adapts is None:
        raise TypeError("Missing 'zope.component.adapts' on table")

    @component.adapter(list(adapts)[0])
    @interface.implementer(list(interface.implementedBy(table))[0])
    def getAdapter(context):
        unsafeContext = removeSecurityProxy(context)
        obj = getStore().find(
            table, table.__parent__ == unsafeContext).one()
        if obj is None and autocreate:
            obj = table(context)
            obj.__parent__ = context
        return obj
    return getAdapter

Now you can define a factory for the adapter:

dataFactory = contentAdapter(Data)

And register "dataFactory" as an adapter.
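
Because getAdapter already carries the adapter and implementer declarations,
the registration itself can be a one-liner (a sketch; a ZCML adapter directive
would work just as well):

from zope import component

# dataFactory was created above via contentAdapter(Data); provideAdapter picks
# up what it adapts and what it provides from the decorated inner function
component.provideAdapter(dataFactory)

# afterwards any ICanHaveData object can simply be adapted:
# data = IData(someContent)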


DublinCore:
If you want to use the full DublinCore implementation from Zope you need to
write a generic implementation.
Usually only parts of the DublinCore interface are used.
We usually implement IDCTimes and IDCDescriptiveProperties. All you need to
do for this is:

class DCStormContent(Storm):
    interface.implements(IDCTimes, IDCDescriptiveProperties)
    created = DateTime()
    modified = DateTime()
    title = Unicode()
    description = Unicode()

That's it!
You can now use IDCTimes and IDCDescriptiveProperties for your formlib
form_fields.
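
For example, an edit form over the descriptive properties boils down to this
(a minimal sketch using zope.formlib):

from zope.formlib import form
from zope.dublincore.interfaces import IDCDescriptiveProperties


class DCEditForm(form.EditForm):
    # title and description come straight from the DublinCore interface,
    # and the Storm properties on DCStormContent satisfy them directly
    form_fields = form.Fields(IDCDescriptiveProperties)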

There are two ways to update "modified" :
 - write an event handler for ObjectModifiedEvent (see the sketch below)
 - do it on the database side with triggers
I prefer using the event handler, because the database trigger does the
update only when writing to the database, which can be too late.
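
A sketch of such a handler, using the usual Zope 3 imports (register it with
provideHandler or a subscriber directive):

from datetime import datetime
from zope import component
from zope.dublincore.interfaces import IDCTimes
from zope.lifecycleevent.interfaces import IObjectModifiedEvent


@component.adapter(IDCTimes, IObjectModifiedEvent)
def updateModified(obj, event):
    # stamp the DublinCore modification time as soon as the object is
    # reported modified, instead of waiting for the database write
    obj.modified = datetime.utcnow()

# component.provideHandler(updateModified)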


Schemas and Storm objects:
We don't use schemas to create our Storm objects or the database table from