Re: [Pythonocc-users] writing OCC data to a db

Thomas Paviot Wed, 05 Jan 2011 06:14:44 -0800

2011/1/5 jelle feringa <jelleferi...@gmail.com>

> Guys, I'm not much of a DB expert, but a NoSQL db like cassandra is really
> built specifically for web 2.0 realtime applications with trillions of
> users. Queries are run by code rather than SQL statements, and the DB is
> reduced to storing / retrieving key/values ( on a large number of machines
> ).



I'm not an expert either, but I'm not sure that, as you write, "cassandra
is really built specifically for web 2.0 realtime applications with
trillions of users". This point does not appear in the project's highlights.
It has been designed to be scalable, durable and fault tolerant, which is
quite different. As far as I understood, the choice of Cassandra by Twitter
is not the result of the millions of users nor the size of the database, but
rather the exponential growth in size and number of users. This problem
(how to stay efficient in a rapidly growing environment) is, in my opinion,
independent of web 2.0 or real-time applications. Furthermore, although
Cassandra, CouchDB, MongoDB, etc. provide low latency for
syncing/writing/reading, they can't be considered "real time".


> I'm pretty confident it's _not_ practical for our sort of purpose.
>

This assertion depends on the "sort of purpose", and I'm not as confident as
you are.


> I'm pretty sure it would be more productive to first mess about with
> something compact and efficient such as sqlite ( or whatever relational db
> ;)
>

That's another aspect. If by "productive" you mean that it would be quicker
to develop such a solution, I agree. You posted some draft Python code a few
days ago that does work, and it could be extended to cover the complete
API.

On the other hand, although I'm not an expert in DB systems, the SQL-based
databases used in product data management have shown serious drawbacks. You
don't need to reach trillions of users to make the system fail; a few
hundred spread around the world are sufficient. I recently got feedback
from an engineer working in the automotive industry: they have many
factories all around the world, with one PDM for the whole company. The main
server is located in Europe, and there is one server per continent mirroring
the main one. Replications are performed every one or two days, which means
that any engineer might be working on outdated data. It's a real problem,
both for the efficiency of collaborative work and for the maintenance and
cost of the IT architecture. And they are not trillions of users.

You might also have heard about the use of HDF5 to manage STEP data. A
single user can create huge sets of CAD data in just one session. Maybe
distributed databases like Cassandra will be an alternative to HDF5.

So the SQL solution would be a short-term vision, and distributed, scalable
databases a long-term one.

Finally, I didn't go down the ORM route because I'm not interested in
doing something that has already been done (there is no real challenge) and
using technologies which are almost as old as me ;-)


> Then again, what the hell do I know.
> Please refer to this wiki entry <http://en.wikipedia.org/wiki/NoSQL> so
> you understand what schema-less DBs are good for.
> Why not rely on a decent ORM like sqlalchemy?
>

In my opinion, the discussion can't get this deeply technical until we have
discussed what you call the "purpose". As I see it:

   - the 'CAD Collaborative Work' is a *really* important part of the high
   level CAD API that has to be designed. It is at the same level as
   Creating/Visualizing CAD objects like shapes, vertices, splines etc. Both
   have to be designed at the same time;
   - the Geometrical/Topological/etc. part of the API has to be independent
   of the underlying CAD kernel. For instance, a pythonOCC user working with
   this high level API must *never* see any TopoDS_Vertex,
   BRepPrimAPI_Something and all that mess;
   - the CAD Collaborative Work must be independent of any database
   technology and the user must *never* see any Oracle/MySQL/NoSQL/whatever
   database. The user just needs to store/retrieve/find CAD objects, whether
   they are stored in a python dict (in memory), in a text or xml file, in a
   local or distant db etc.
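
To make the last point more concrete, here is a minimal Python sketch of
what such a storage-independent layer could look like. Everything in it is
an assumption made for illustration: the names ShapeStore, DictStore,
SqliteStore and save_part are hypothetical, and CAD objects are treated as
opaque bytes that have already been serialized by some other part of the
API. Only the standard library is used.

```python
import sqlite3
from abc import ABC, abstractmethod


class ShapeStore(ABC):
    """Hypothetical interface: the user only stores, retrieves and finds."""

    @abstractmethod
    def store(self, key, data):
        """Save serialized CAD data (bytes) under a string key."""

    @abstractmethod
    def retrieve(self, key):
        """Return the bytes saved under key, or raise KeyError."""

    @abstractmethod
    def find(self, prefix):
        """Return the sorted list of keys starting with prefix."""


class DictStore(ShapeStore):
    """In-memory backend based on a plain python dict."""

    def __init__(self):
        self._items = {}

    def store(self, key, data):
        self._items[key] = data

    def retrieve(self, key):
        return self._items[key]

    def find(self, prefix):
        return sorted(k for k in self._items if k.startswith(prefix))


class SqliteStore(ShapeStore):
    """Relational backend; the SQL never leaks out of this class."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS shapes (key TEXT PRIMARY KEY, data BLOB)"
        )

    def store(self, key, data):
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO shapes (key, data) VALUES (?, ?)",
                (key, data),
            )

    def retrieve(self, key):
        row = self._conn.execute(
            "SELECT data FROM shapes WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]

    def find(self, prefix):
        rows = self._conn.execute(
            "SELECT key FROM shapes WHERE substr(key, 1, ?) = ? ORDER BY key",
            (len(prefix), prefix),
        )
        return [r[0] for r in rows]


def save_part(store, name, payload):
    """User-level code: works with any backend, knows nothing about it."""
    store.store("part/" + name, payload)
```

With this layout, save_part(DictStore(), "bolt", data) and
save_part(SqliteStore(), "bolt", data) are the same call from the user's
point of view, and a distributed backend could be added later as one more
subclass without touching user code.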

To conclude, I think a well designed HLA must be 'user oriented' instead of
'technology oriented'. Let's focus on user needs first, and then implement
different technologies. That's what I called a few days ago a top-down
approach: go from user to implementation rather than from technology to user.
From this point of view, the use of any technology remains possible.


>
> Cheers,
>

Cheers!


>
> -jelle
>

Jelle

_______________________________________________
Pythonocc-users mailing list
Pythonocc-users@gna.org
https://mail.gna.org/listinfo/pythonocc-users
