Hello. I would like to collect some advance feedback on a new module (or small collection thereof) which I have already been working on as part of a larger effort, but which I am considering releasing as a separate project. I had already brought up this issue on [EMAIL PROTECTED] and Tim Bunce suggested that [EMAIL PROTECTED] would be the best place to ask potential users about its viability.
You should read the email to the end before replying; the questions are at the end. Please reply both to [EMAIL PROTECTED] and to [EMAIL PROTECTED], as I am not subscribed to the mailing list, and to keep everyone in the loop; I would do the same. The module I propose will implement structured container objects that model or describe all kinds of de-serialized SQL schema and instructions. Each object is equivalent to either a single SQL statement or an ordered collection of them (such as a stored SQL procedure or a database schema). This is similar to how an XML DOM is equivalent to an XML Document; an XML Document is a serialized form of an XML DOM, and an XML DOM is a de-serialized form of an XML Document. The idea is that my objects can be used by other database-using applications or modules in place of SQL strings. An advantage of the objects is that you can build SQL simply by plugging individual scalar values (like names of table columns or names of functions or literal data values) into slots, which are as easy to set as individual elements in arrays or hashes. Similarly, it is easy to read individual attributes from SQL instructions, like reading array elements. Using the objects makes it a lot easier to, for example, analyze a select/view definition in order to generate appropriate insert/update/delete instructions against selected data. You never have to parse or concatenate or escape strings at any time. Using the objects is especially well suited for applications that make use of a data dictionary to describe what they do, as individual scalar values in the data dictionary can just be straight copied between the data dictionary and appropriate places in my object model. This is a lot easier than having to generate SQL yourself from said data dictionaries, and you can save on common mistakes such as not escaping your data. Of course, since many databases use serialized SQL as their standard input (as does the DBI module), my objects would have to be serialized at an appropriate time. However, serializing and de-serializing/parsing SQL is not and will never be handled by my module. Rather, this can be done by any third party generic modules of your choice, each of which would specialize in some way or other. Said modules would usually specialize in what specific database they are relating to, which have their own special ways of saying the same things. (Outer join syntax, anyone?) And to get things rolling, I will probably make a serializing module myself, as an example, unless someone else is quick to step up and do it. Since SQL is not identical with each database, what my object model would actually do is model a normalized superset of the features of different databases. When I say normalized, I mean that if there is a single task that is known by several different names or syntaxes (such as outer-joins), it will have exactly one name in the model. When I say superset, I mean that if a database supports a useful feature but other databases don't, it will still be represented in the model so people using that database can access it; but you can only use the feature with that database, or alternately with other databases for which someone decided to emulate the feature. But so long as the actual functionality you want is supported, your applications written against my object model will never have to be changed for the small ideosyncosies (sp?) of various databases that have features you use. Sort of in the middle between those two extremes is functionality which is not natively supported in some databases, but that is simple enough to emulate using built-in functionality (for example, emulating "left()" with "substr()"); in such cases, it is expected that when you use such features they will work with any database. For example, if you want a model-specified boolean data type, you will always get it, whether it is implemented on a per-database-basis as a "boolean" or an "int(1)" or a "number(1,0)". Or a model-specified "str" data type you will always get it, whether it is called a "text" or a "varchar2" or a "sql_varchar". Basically, my SQL object-model is tasked simply with describing, in a non-ambiguous and structured manner, the schemas or instructions that you want a database to do. The object-model doesn't know anything specific to a particular database; it doesn't even care whether the database uses SQL as its input; an intermediary module can translate the object-model into some other kind of non-SQL output for them. The object-model simply defines an ideal complete capabilities list and works towards those; and the databases just implement whichever parts of that list they can. More complete databases like Oracle will probably implement almost all of the list, while MySQL 3.x would implement much less of it natively. This is how I keep my module simple and easy to use. Much like DBI itself is fundamentally a simple and easy to use interface, while the separate DBD plug-in modules take care of the implementation details. I note that my object-model is intended to model all of SQL in a deserialized form. This is much unlike all of the other SQL-generating or parsing modules I have seen on CPAN, which are mainly limited to the most common tasks, which are select/insert/update/delete, and creating/removing tables. And even those usually don't support a lot of SQL's features for selects. And the few frameworks that I did see which recognized all the other main parts of a database schema, they simply extract or set serialized SQL which users have to write themselves. My object model will represent all of these things, in a fully deserialized form: table definitions (with multiple-column unique keys and foreign keys), view definitions (also known as named select statements), definitions for stored procedures, stored functions, triggers, sequences, and Oracle's packages (do other databases have a concept like 'packages'?); select statements or views or cursor definitions can be arbitrarily complex, including multiple-column table joins, outer joins, unions, Oracle's start-with-connect-by, group-by, have subqueries in the 'from' or 'where' clauses, use various built-in functions (like 'substr', 'concat', '+','*','log','count','sum','and','or','=', '<', more) almost anywhere, call user-defined stored functions (if the underlying database allows it), and more. I will note that an advantage of having these definitions fully parsed is that, for example, if a database doesn't support built-in stored procedures, a "sql generator" could alternately generate an anonymous Perl function that does the same thing, to be called instead. Now, I will note that some of the features may not be important to many people; eg: stored procedures. However, if you are working on any projects of significant size, you can get an idea that these things would actually be a large speed, reliability, and security savings for you. Look at many large corporate or government systems. I believe that if these advanced features can also be represented in a database-neutral format, then users can get the full power of a database without being locked into a single vendor due to all their investment in stored procedures et al. If customers can move a lot more easily, it will help encourage database vendors to keep improving their products or lower prices to keep their customers, and users in general benefit. So I do have reasons for trying to tackle the advanced database features in my database abstraction effort. So I think I would be bringing a lot of new functionality to CPAN, and not be making "yet another SQL generator module". What I am proposing is that my object-model module could be quite useful to makers of other database-related modules, as sort of a useful (or, dare I say it, "standard") interchange or intermediate format, becoming as ubiquitous in the database area as basic arrays or hashes are elsewhere, or such as XML DOMs are in many places. And this isn't vaporware, per se. I have the full intention of making this and making it quickly, should there be the demand (or if nothing else, to use myself). That said, it will take more time or people to move this beyond alpha or beta quality within a short time. But it shouldn't be long before you can play with it. Only if I shall become incapacitated soon shall this die. Anyway, getting to the point, I wanted to solicit your opinions or feedback, as users of DBI and related modules. 0. Does it look like I am blowing smoke or bull and am unlikely to deliver on what I propose? 1. Does my module proposal sound like something you would want to use, either in your applications or your own generic database modules? 2. Alternately, do you have trouble following what my proposal is, and so wouldn't be able to judge whether it might be useful? 3. If such a module would be useful, what are some specific features or priorities you would want it to have? 4. Are you aware of any existing modules or frameworks on CPAN that already do what I am proposing? If so, please point them out to me, as all of the similar modules I found do not do some of the major features, as I interpreted their documentation. What are their names, and where are they found. 5. While I will, of course, be putting out my module under an open-source licence, is it important to you that a certain license or licenses are used or not used? For example, if I put it out under plain GPL, would that provide any roadblocks to your use of the module? Or would you only use it if you had the choice between GPL and Artistic (as is Perl itself). Alternately, would you use it if the licensing terms were like MySQL? 6. Does this module sound like something you would want to work on yourself, either to get it done sooner or to ensure it has the features you want? Not that I expect anyone to, but I ask anyway. So, I appreciate any feedback you can give me. Thank you very much in advance. Please reply both to [EMAIL PROTECTED] and to [EMAIL PROTECTED], as I am not subscribed to the mailing list, and to keep everyone in the loop; I would do the same. Have a good day. -- Darren Duncan
