2011/1/5 Thomas Paviot <tpav...@gmail.com>

> 2011/1/5 jelle feringa <jelleferi...@gmail.com>
>
>> Guys, I'm not much of a DB expert, but a NoSQL db like cassandra is really
>> built specifically for web 2.0 realtime applications with trillions of
>> users. Queries are run by code rather than by SQL statements, and the DB is
>> reduced to storing / retrieving key/values (on a large number of machines).
>
> I'm not an expert either, but I'm not sure that, as you write, "cassandra
> is really built specifically for web 2.0 realtime applications with
> trillions of users". This point does not appear in the project's highlights.
> It has been designed to be scalable, durable and fault tolerant, which is
> quite different. As far as I understand, Twitter chose Cassandra not because
> of its millions of users or the size of its database, but because of the
> exponential growth in size and number of users. This problem (how to stay
> efficient in a rapidly growing environment) is, in my opinion, independent
> of web 2.0 or real-time applications. Furthermore, although Cassandra (or
> CouchDB or MongoDB etc.) provides low latency for syncing/writing/reading,
> these databases can't be considered "real time".
>
>> I'm pretty confident it's _not_ practical for our sort of purpose.
>
> This assertion depends on the "sort of purpose", and I'm not as confident
> as you are.
>
>> I'm pretty sure it would be more productive to first mess about with
>> something compact and efficient such as sqlite ( or whatever relational db ;)
>
> That's another aspect. If by "productive" you mean that it would be quicker
> to develop such a solution, I agree. You posted draft python code a few
> days ago that does work, and could be extended to cover the complete API.
>
> On the other hand, although I'm not an expert in DB systems, the SQL-based
> databases used in product data management have shown serious drawbacks.
> You don't need to reach trillions of users to make the system fail; a few
> hundred users located all around the world are sufficient. I recently got
> feedback from an engineer working in the automotive industry: they have many
> factories all around the world, with one PDM for the whole company. The main
> server is located in Europe, and there is one server per continent mirroring
> the main one. Replication is performed every one or two days, which means
> that any engineer might be working on outdated data. It's really a big deal,
> both for the efficiency of collaborative work and for the maintenance and
> cost of the IT architecture. And they are not trillions of users.
>
> You might also have heard about the use of HDF5 to manage STEP data. A
> single user can create huge sets of CAD data in only one session. Maybe
> distributed databases like Cassandra will be an alternative to HDF5.
>
> So the SQL solution would be a short-term vision, and distributed, scalable
> databases a long-term vision.
>
> Finally, I didn't go down the ORM mapper route because I'm not interested in
> doing something that has already been done (there is no real challenge) and
> using technologies which are almost as old as me ;-)
>
>> Then again, what the hell do I know.
>> Please refer to this wiki entry <http://en.wikipedia.org/wiki/NoSQL> so
>> you understand what schema-less DB's are good for.
>> Why not use a decent ORM like sqlalchemy?
>
> In my opinion, the discussion can't get this deeply technical until we have
> discussed what you call the "purpose". In my view:
>
> - the 'CAD Collaborative Work' is a *really* important part of the high
>   level cad api that has to be designed. It is at the same level as
>   Creating/Visualizing CAD objects like shapes, vertices, splines etc. Both
>   have to be designed at the same time;
> - the Geometrical/Topological/etc. part of the API has to be
>   independent from the underlying CAD kernel.
>   For instance, a pythonOCC user
>   working with this high level API must *never* see any TopoDS_Vertex,
>   BRepPrimAPI_Something and all that mess;
> - the CAD collaborative work must be independent from any database
>   technology, and the user must *never* see any Oracle/MySQL/NoSQL/whatever
>   database. The user just needs to store/retrieve/find CAD objects, whether
>   they are stored in a python dict (in memory), in a text or xml file, in a
>   local or remote db etc.
>
> To conclude, I think a well designed HLA must be 'user oriented' instead of
> 'technology oriented'. Let's focus on user needs, and only then implement
> different technologies. That's what I called a few days ago a top-down
> approach: go from user to implementation rather than from technology to user.
> From this point of view, the use of any technology remains possible.
>
>> Cheers,
>
> Cheers!
>
>> -jelle
>
> Jelle

Stupid me! I'm Thomas, not Jelle ;)
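
For what it's worth, the third point in the quoted message (the user only stores/retrieves/finds CAD objects, whatever the backend) could be sketched roughly as below. This is only an illustration, not an existing pythonOCC API: the names CadStore, DictStore, SqliteStore and save_demo_shapes are hypothetical, and plain JSON-serializable dicts stand in for real CAD objects.

```python
import json
import sqlite3
from abc import ABC, abstractmethod


class CadStore(ABC):
    """Hypothetical interface: user code only ever sees these three calls."""

    @abstractmethod
    def store(self, key, obj):
        """Save obj under key, replacing any previous value."""

    @abstractmethod
    def retrieve(self, key):
        """Return the object saved under key, or raise KeyError."""

    @abstractmethod
    def find(self, predicate):
        """Return the sorted keys of all objects for which predicate is true."""


class DictStore(CadStore):
    """In-memory backend based on a python dict."""

    def __init__(self):
        self._data = {}

    def store(self, key, obj):
        self._data[key] = obj

    def retrieve(self, key):
        return self._data[key]

    def find(self, predicate):
        return sorted(k for k, v in self._data.items() if predicate(v))


class SqliteStore(CadStore):
    """Relational backend; objects are serialized to JSON text (an assumption)."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS objects (key TEXT PRIMARY KEY, body TEXT)"
        )

    def store(self, key, obj):
        self._conn.execute(
            "INSERT OR REPLACE INTO objects VALUES (?, ?)", (key, json.dumps(obj))
        )
        self._conn.commit()

    def retrieve(self, key):
        row = self._conn.execute(
            "SELECT body FROM objects WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return json.loads(row[0])

    def find(self, predicate):
        rows = self._conn.execute("SELECT key, body FROM objects ORDER BY key")
        return [k for k, body in rows if predicate(json.loads(body))]


def save_demo_shapes(store):
    """User-level code: identical whichever backend is passed in."""
    store.store("v1", {"kind": "vertex", "xyz": [0.0, 0.0, 0.0]})
    store.store("s1", {"kind": "spline", "degree": 3})
    return store.find(lambda obj: obj["kind"] == "vertex")
```

The same save_demo_shapes call works with DictStore() or SqliteStore(), so a distributed backend could later be added behind the same three methods without touching user code.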
_______________________________________________
Pythonocc-users mailing list
Pythonocc-users@gna.org
https://mail.gna.org/listinfo/pythonocc-users