I've never used Elixir so I can't really comment on the pros and cons. With my personal experience I would go with an ETL and copy the desired data directly into your ordering system. Just keep in mind that these new tables should stay read-only as far as the ordering system goes. The data would be there solely for ease of reference.
It really depends on where your skills lie. Play around with Elixir and see if you can get what you need out of it. I see only two things to keep in mind with Elixir and/or muti-db. 1) You won't be able to create real foreign keys because they are in different physical systems. Though you can still create columns in the ordering system that hold the primary keys of the other systems. 2) You won't be able to do real joins. You will have to query each system independently. And yes this will still be the case with the multi-db support when it is in place. Of course you will be able to intelligently build the queries based on the dataset in the ordering system. On Mon, Sep 28, 2009 at 3:13 PM, Tony Schmidt <[email protected]>wrote: > > "If you need to link one order item to other data in the warehouse it > should > only either be done in the warehouse or between the systems outside of > the > warehouse." > > Now we're back to the crux of the topic! Joins in the warehouse, or > cross-vendor joins between disparate systems? > > If multi-db _were_ done, or if multi-db connections with cross-vendor > joins via Elixir were an option, would that be a preferable route to > creating those joins between the order entry system and the > warehouse? I don't really need the warehouse for what it seems to be > used for traditionally - analytics, reports and BI - I simply need it > to make my life easier when trying to generate order and order_item > records that have foreign key attributes in different systems like > "product_id", "employee_id", "member_id", and so forth. (Although, as > I mentioned earlier, having a single repository for my OOBase users to > go to would be an added plus.) > > So maybe I shouldn't be calling what I need a "warehouse", but rather > a "central data repository with an ETL layer"? Or maybe I should > throw out the ETL overhead entirely and figure out how to perform > cross-vendor, multi-db joins? > > That is still my quandary.. > > > On Sep 26, 7:46 am, Joshua Russo <[email protected]> wrote: > > No one is recommending multi-db because it's not done yet. > > > > I'm not sure I understand exactly what you mean by "creating new order > and > > order_item data with keys to data in the warehouse" but that doesn't > sound > > like a recommended approach. > > A data warehouse (in my opinion) should be read only and no other system > > should need to know about it. An important concept to keep in mind with a > > warehouse is that it is never the originator of the base data (it may > create > > new computations of the data but it's always based on data from other > > systems). As such it is for reporting purposes only and no other system > > should treat it as the system of record. > > > > If you need to link one order item to other data in the warehouse it > should > > only either be done in the warehouse or between the systems outside of > the > > warehouse. It sounds like you want to modify the order entry application > to > > reference data in other systems and still display the data in the order > > entry app. If this is the case the only solution I would implement is > > between the two existing systems. > > > > As far as OLAP goes, it's not mutually exclusive from ER. It is simply a > > method of extracting multidimensional data. An ER model can most > certainly > > be dimensional, you just need to integrate the proper timestamp > structure. > > > > On Thu, Sep 24, 2009 at 4:35 PM, Tony Schmidt <[email protected] > >wrote: > > > > > > > > > Well, I guess the answer to the thread is looking like: > > > > > - a data warehouse is preferable to multiple db connections and cross- > > > vendor joins > > > > > But that's only because I haven't heard anyone voice support for the > > > multi-db idea. > > > > > Now the question is: > > > > > How should the warehouse be constructed in tandem with new > > > functionality and data entities? > > > > > What I've read about data warehouses seems to limit them to read-only > > > OLAP, with special data modelling (namely dimensional, as opposed to > > > the ER modelling I'm used to). But in the example I give in my first > > > post, I'm talking about creating new order and order_item data with > > > keys to data in the warehouse. Should that data be stored in the > > > warehouse, as well, so I don't have to go back to multiple db > > > connections for my joins? If it co-exists with the read-only data, > > > should it all be modelled the same? (all ER or all dimensional) Or > > > can it be mixed? > > > > > I would start a new thread, but whereas the multi-db issue related > > > directly to developing Django technology, I'm not sure the issue of > > > the warehouse design is an appropriate topic for this group (although > > > I'm also not sure where I should discuss it). > > > > > On Sep 24, 2:36 am, Joshua Russo <[email protected]> wrote: > > > > If he wants to discuss it here I can do that too. It was just a > little > > > off > > > > topic, but not not too much I suppose. > > > > > > On Thu, Sep 24, 2009 at 7:41 AM, nausikaa <[email protected]> > wrote: > > > > > > > Hi Joshua > > > > > > > Thanks for sharing your knowledge. > > > > > Too bad, I would've liked to read your suggestion to Tony as well. > ; ) > > > > > I'm a just graduate and at work I'm working on something very > similar > > > > > to what has been discussed here. > > > > > > > On Sep 23, 9:51 pm, Joshua Russo <[email protected]> wrote: > > > > > > I have many years of database design experience and it sounds > like > > > you > > > > > are > > > > > > getting lost in terminology. > > > > > > I would start out with a set of tables in the warehouse that > mirror > > > the > > > > > > existing data. In these tables you can either wipe and reload > each > > > time, > > > > > or > > > > > > create a mechanism to track each individual load. > > > > > > > > You can then simply write your application against these tables > or if > > > you > > > > > > feel adventurous you can add a more unified set of tables with > > > linking > > > > > > foreign keys and such and have your process also update the > > > cleaner/more > > > > > > unified data set. This second part get's tricky and takes > practice to > > > > > know > > > > > > you have a good structure. You will also have to track changes, > > > > > additions, > > > > > > and deletions. If you want some help or advice I would be happy > to > > > take a > > > > > > gander at your structures to show you what I might do. Feel free > to > > > email > > > > > me > > > > > > directly at [email protected]. > > > > > > > > On Wed, Sep 23, 2009 at 6:35 PM, Tony Schmidt < > > > [email protected] > > > > > >wrote: > > > > > > > > > Thanks for your reply, Joshua. > > > > > > > > > Complexity is exactly what I'm trying to avoid - but after > doing a > > > > > > > little research on data warehouses (I've never built one), that > > > seems > > > > > > > like a very complex route as well! > > > > > > > > > I would like to just come up with my "dream schema" and run > update > > > > > > > scripts on a daily or event-driven basis, but there seem to be > a > > > > > > > million concepts related to data warehousing: data marts, > > > dimensions, > > > > > > > star schemas, snowflakes, EAV tables, and so on. > > > > > > > > > It seems more straight forward if I could just connect directly > to > > > the > > > > > > > operational systems. I wonder if all that theory is really > > > necessary > > > > > > > for a basic data warehouse... > > > > > > > > > On Sep 23, 12:09 pm, Joshua Russo <[email protected]> > wrote: > > > > > > > > The multi-db branch is just now in the process of being baked > > > into > > > > > the > > > > > > > core > > > > > > > > and from what I can tell it's not quite done yet. Even when > it is > > > > > > > completely > > > > > > > > done I would recommend the data warehouse approach. I view > the > > > > > multi-db > > > > > > > > functionality more as a last resort, where you really don't > have > > > an > > > > > > > option > > > > > > > > to merge the data. It adds a lot of complexity so if you can > do > > > > > without I > > > > > > > > would recommend an alternative. > > > > > > > > Just my 2 cents > > > > > > > > Josh > > > > > > > > > > On Wed, Sep 23, 2009 at 4:05 PM, Tony Schmidt < > > > > > [email protected] > > > > > > > >wrote: > > > > > > > > > > > Hi, Nausikaa. Thanks for your reply. > > > > > > > > > > > Unfortunately, the legacy systems must remain in place > until > > > they > > > > > are > > > > > > > > > gradually (if ever) phased out. There's a whole bunch of > > > > > > > > > functionality that I don't want to have to recreate in > those > > > > > systems > > > > > > > > > (POS, inventory/accounting, products, etc.). I just want > to > > > build > > > > > new > > > > > > > > > functionality that accesses that data (like the example of > an > > > order > > > > > > > > > entry form that creates new order/item data entities in a > new > > > DB > > > > > with > > > > > > > > > keys to entities in the other DBs or to their versions in > the > > > > > > > > > warehouse). Most of it should be read only. > > > > > > > > > > > I'm starting to hear that a data warehouse is the way to go > - > > > but > > > > > then > > > > > > > > > there's the question of data warehouse vs. data mart, Inmon > vs. > > > > > > > > > Kimball, and how I can get started building one in python. > I'm > > > not > > > > > > > > > hearing many suggestions for the multi-db approach (which > makes > > > me > > > > > > > > > wonder what the Django muli-db branch is for?) > > > > > > > > > > > On Sep 22, 2:01 am, nausikaa <[email protected]> > wrote: > > > > > > > > > > Hi snfctech > > > > > > > > > > > > With warehouse I assume you mean keeping the datasources > and > > > > > periodic > > > > > > > > > > transfer into a central db (the warehouse). > > > > > > > > > > Why not migrate all your datasources into e.g. a > PostgreDQL > > > db? > > > > > > > > > > It is easy to write forms and implement logins/access > rights > > > in > > > > > > > > > > django so that your non-technical users can read or edit > the > > > > > > > > > > data. Besides you'd remove some (unnecessary) > heterogenity > > > and > > > > > > > thereby > > > > > > > > > > complexity from your system. > > > > > > > > > > But since I don't know your system I might be missing the > > > point > > > > > > > > > > completely. > > > > > > > > > > > > Nausikaa > > > > > > > > > > > > On Sep 22, 3:10 am, snfctech <[email protected]> > > > wrote: > > > > > > > > > > > > > I understand that there is a Django branch being > actively > > > > > worked on > > > > > > > > > > > for connections to multiple DB vendors, or that Django > + > > > Elixir > > > > > may > > > > > > > be > > > > > > > > > > > a good option. But I'm wondering if building a single > data > > > > > > > warehouse > > > > > > > > > > > may still be a better way to go? > > > > > > > > > > > > > Here's an example of some of the relations I'm going to > > > have to > > > > > > > build > > > > > > > > > > > for my project: > > > > > > > > > > > > > I've got order and order_item tables with their own > data > > > and > > > > > > > relations > > > > > > > > > > > to members (Access DB), products (flat file) and > employees > > > > > (MySQL). > > > > > > > > > > > > > I initially thought that the best way to manage this > would > > > be > > > > > to > > > > > > > > > > > create a new DB for the order and order_item tables, > and > > > then > > > > > > > create > > > > > > > > > > > cross-vendor joins in the ORM. But then I came across > an > > > > > > > unexpected > > > > > > > > > > > advantage of having all the data in an updated > warehouse - > > > my > > > > > semi- > > > > > > > > > > > technical staff could still use products like OOBase, > that > > > are > > > > > > > limited > > > > > > > > > > > to a single vendor connection, to make reports and > forms > > > based > > > > > on > > > > > > > the > > > > > > > > > > > warehouse data. > > > > > > > > > > > > > So now I'm wondering - are direct connections to > multiple > > > > > databases > > > > > > > > > > > really the best way to go? Or are there more > advantages to > > > > > > > building a > > > > > > > > > > > data warehouse (keeping in mind the complexities of the > > > data > > > > > > > > > > > replication, scripts for pushing and pulling data, > etc.) > > > > > > > > > > > > > Thanks in advance for any tips. > > > --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to [email protected] To unsubscribe from this group, send email to [email protected] For more options, visit this group at http://groups.google.com/group/django-users?hl=en -~----------~----~----~----~------~----~------~--~---

