Hi Lazaro and Massimo,

The Fi$cal project is not my project; I don't know how many tables it has, 
but is an example of highly relational financial software so that web2py 
members could understand my question clearly. Most projects will be in 
strict 3rd Normal Form.

Let me be more transparent regarding my goals.

I have a nonprofit government improvement 
organization<http://www.gov-ideas.com/>and part of our mission is to recommend 
software architecture to 
government. I am evaluating web2py as a candidate for prototyping 
applications. I'm very excited that Massimo's vision to lower the average 
person's programming entry barrier also works to improve government by 
allowing rapid prototyping. 

The next step is to assess how to best migrate prototypes to production. 
There maybe no further steps necessary, or for large projects like Fi$cal, 
there may be a consensus such as moving the entire project to 
Python/sqlAlchemy which provides the most stability, but allows copying of 
some web2py components for reuse.

It is a significant responsibility to make these recommendations and I need 
many diverse advisors to select products that reduce risk in government 
software development. 


   1. At this time is there a consensus regarding the top candidates for 
   open source business-oriented, highly relational software tools? 
   2. What would the top 10 look like? 
   3. What are the reasons that each one was selected? 
   4. At what points or areas would other products surpass web2py?


Any comments would be much appreciated,

Alex

On Saturday, December 29, 2012 8:47:42 AM UTC-8, Massimo Di Pierro wrote:
>
> I should add that in the system I have looked almost all tables where 
> de-normalized. All tabled storing a key would also stored the value 
> corresponding to the code. This is for two reasons: 1) if you have 
> thousands of tables you cannot join everything all the time. 2) for 
> auditing purposes it should be possible to change a value in a lookup table 
> without changing the corresponding previous values of records created 
> before.
>
> This means that while I needed the lookup tables for IS_IN_SET() 
> IS_IN_DB() validators, I never needed more than two joins.
>
> Again, the all design seems crazy to many of use but it has its plus sides 
> from an auditing point of view.
>
> Massimo
>
> On Friday, 28 December 2012 22:39:51 UTC-6, Massimo Di Pierro wrote:
>>
>> I have some experience with a large peoplesoft system where they claims 
>> thousands of tables. Turns out almost everything I needed was in less than 
>> 10 tables.
>>
>> Basically most of the tables were lookup tables used simply as key:value 
>> store for the possible values of a field. For example table GENDER 
>> {'M':'Male','F':'Female','O':'Other'}.
>>
>> So 10 tables with 100 columns each and one lookup table for each column 
>> gives you more than 10,000 tables. I suspects that is your case too.
>>
>> All systems that claim an insane amount of tables belong to this category.
>>
>> The way to handle it is to load all lookup tables in cache and use cache 
>> instead of database access to convert key<->value. In fact the values for 
>> lookup tables almost never change.
>>
>> I suggest before you embark in this venture do the following exercise: 
>> make a list of all table names. For each table make list of fields in the 
>> table and count the number of records (more or less).
>>
>> You will find many tables with less then 100 records and less then 10 
>> columns. You will find a few tables with more than 10 columns and more then 
>> 100000 records. You need to find out how many tables belong to one category 
>> and how many to the other.
>>
>> If this is the case, as I suspect, than you can use web2py but you need 
>> to setup some clever caching system to hable the lookup tables. It would be 
>> the same with other frameworks since you don't want to join everything all 
>> the time or your database will grind to a halt.
>>
>> It is also possible I am completely wrong in my assumption.
>>
>> In the case of the peoplesoft system I studied they also were storing all 
>> past versions of each record in the same table as the current record. 
>> Basically every record had had two dates (valid_from, valid_until). Current 
>> records had valid_until set to 2999-12-31. records would never be modified. 
>> The process for modifying a record consisted of creating a copy of the 
>> current record, editing the copy, setting the valid_until=now for the 
>> previous current record, updating all references pointing to the record. Af 
>> course all  tables used the same mechanism for versioning thus making the 
>> update process very slow and cumbersome, and all tables un-necessary large. 
>> Yet this simplifies auditing because you can go back to any moment in time 
>> simply by filtering records in a query.
>>
>> The reason I am explaining all of this is that probably you are going to 
>> have to deal with something like this. The problem is not web2py vs other 
>> framework. The problems will be that you need special logic to handle those 
>> tables which is foreign to web2py and many modern frameworks which simply 
>> assume more moder database design practices.
>>
>> My suggestion is start small and see what happens. Find who are your 
>> primary target users. Find which tables they need to access and create a 
>> web interface for those tables. You will probably be able to factorize the 
>> interaction with the database in many small apps.
>>
>> Massimo
>>
>>
>>
>> On Friday, 28 December 2012 09:38:25 UTC-6, Alex Glaros wrote:
>>>
>>> Can web2py be used for highly complex relational databases for large 
>>> fiscal projects? Example: California's Fi$cal project - 
>>> http://www.fiscal.ca.gov/<http://www.linkedin.com/redirect?url=http%3A%2F%2Fwww%2Efiscal%2Eca%2Egov%2F&urlhash=DBJm&_t=tracking_anet>
>>>  - with roughly 10,000 tables and many complex joins.
>>>
>>> What components of web2py would start to get slow or not work well when 
>>> having so many tables? 
>>>
>>> If web2py would instead be better used to prototype the Fi$cal system, 
>>> what would be good production-version candidates to migrate to? Pure Python 
>>> using sqlAlchemy? Java? Anything that would make migration easier such as 
>>> Python-based frameworks?
>>>
>>> Thanks,
>>>
>>> Alex Glaros
>>
>>

-- 



Reply via email to