RE: SQL DB Integration

2012-01-29 Thread Krassimir Kostov

Hi Viktor,
 
Thanks for the comments.  True, the characteristics that I outlined were 
general, just to give some background and context for the problem I’m trying to 
solve.  I will address more specific questions when it comes to designing and 
implementing the data storage solution and the API to do the integration of (1) 
– (3) above.
 
Given that our data mining application (IBM SPSS Modeler), our partner platform 
(Oracle DB data model), which is used for additional services, and our clients’ 
DBs are all based on SQL, I would like to ask, from your experience:
 
(1) Is it a good idea to use Cassandra as a storage solution for SQL data, 
converted to the NoSQL data model just to be stored on Cassandra?
(2) Do you know of any similar cases of using Cassandra as storage for SQL data 
applications, or do the differences in data model architecture and the high 
development costs make this impractical?
(3) If using Cassandra as storage for SQL data applications is not a good idea, 
can you recommend an alternative SQL cloud DB solution that has good 
scalability?
 
Thanks and regards,
 
Krassimir Kostov  

SQL DB Integration

2012-01-27 Thread Krassimir Kostov

Hello! 

I am working on a project, for which I have to evaluate and recommend the 
implementation of a new database system, with the following major 
characteristics: 

* Operational scalability 
* Low cost 
* Ability to serve both as a data storage facility and an advanced data 
manipulation tool 
* Speed of execution 
* Real-time writing capability, with potential to record millions of client 
data records in real time 
* Flexibility: ability to support all client data types and formats, structured 
and unstructured 
* Capability to support multiple data centers and geographies 
* Ability to provide data infrastructure solutions for clients with small and 
Big Data needs 
* Full and flawless integration with the following 3 infrastructures: 

  (1) A data mining application (IBM SPSS Modeler) that imports/exports data 
from/to an SQL database 
  (2) A partner platform, based on an Oracle Database (CSV data import/export) 
  (3) Various client SQL databases, whose data elements will be uploaded and 
replicated in the recommended database system 

As a result of my research, I am planning to recommend the implementation of 
Apache Cassandra NoSQL DB, hosted on Amazon Elastic Compute Cloud (Amazon EC2). 
I realize that the biggest challenge among the above 3 points is probably the 
last one, since for each client we need to custom-build and replicate their 
database, changing the data model from SQL to NoSQL. Points (1) and (2), by 
contrast, only involve transferring data back and forth between the SQL and 
NoSQL environments.
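
To make the SQL-to-NoSQL remodelling in point (3) more concrete, here is a 
minimal sketch of one possible mapping, not a definitive design. It assumes each 
SQL row becomes one entry keyed by its primary key, with NULL columns simply 
omitted, which loosely mirrors a sparse column-family layout. The function name 
rows_to_wide_rows, the clients table, and its columns are hypothetical, and an 
in-memory SQLite database stands in for a client SQL database; no Cassandra 
client is involved.

```python
import sqlite3


def rows_to_wide_rows(conn, table, key_column):
    """Map each SQL row to row_key -> {column_name: value}.

    Hypothetical helper: the table name is interpolated directly,
    so it must come from trusted configuration, not user input.
    """
    cur = conn.execute(f"SELECT * FROM {table}")
    names = [d[0] for d in cur.description]
    wide = {}
    for row in cur:
        record = dict(zip(names, row))
        key = str(record.pop(key_column))
        # Assumption: NULL columns are dropped rather than stored.
        wide[key] = {k: v for k, v in record.items() if v is not None}
    return wide


# Stand-in for a client SQL database (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO clients VALUES (?, ?, ?)",
    [(1, "Acme", "Sofia"), (2, "Globex", None)],
)
wide_rows = rows_to_wide_rows(conn, "clients", "id")
```

A real migration would also have to decide how joins and foreign keys are 
denormalized, which this sketch does not attempt.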

My question is: how easy or difficult is it to build a GUI/API that can handle 
the integration in the above 3 points, i.e. transferring data (upstream / 
downstream) between the SQL and Cassandra NoSQL environments? Do you have any 
other comments or suggestions that I should consider? 
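
As a rough illustration of the upstream / downstream transfer asked about here, 
the sketch below moves a table between two SQL databases through CSV text, the 
interchange format mentioned for the Oracle-based partner platform. It is a 
minimal sketch under stated assumptions: export_csv and import_csv are 
hypothetical helper names, the records table is invented for the example, and 
in-memory SQLite databases stand in for the real systems. The Cassandra side of 
the transfer is not shown.

```python
import csv
import io
import sqlite3


def export_csv(conn, table):
    """Dump a table to CSV text, with a header row of column names."""
    cur = conn.execute(f"SELECT * FROM {table}")  # trusted table name assumed
    buf = io.StringIO(newline="")
    writer = csv.writer(buf)
    writer.writerow([d[0] for d in cur.description])
    writer.writerows(cur)
    return buf.getvalue()


def import_csv(conn, table, text):
    """Load CSV text (header row first) into an existing table."""
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader)
    columns = ", ".join(header)
    placeholders = ", ".join("?" for _ in header)
    rows = list(reader)
    conn.executemany(
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders})", rows
    )
    return len(rows)


# Two stand-in databases with the same (hypothetical) schema.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for db in (source, target):
    db.execute("CREATE TABLE records (id INTEGER, label TEXT)")
source.executemany("INSERT INTO records VALUES (?, ?)", [(1, "a"), (2, "b")])

csv_text = export_csv(source, "records")
imported = import_csv(target, "records", csv_text)
copied = target.execute("SELECT id, label FROM records ORDER BY id").fetchall()
```

CSV carries no type information, so the target schema has to restore types on 
import; here SQLite's column affinity converts the id values back to integers.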

Thanks a lot for your involvement and have a great day! 

Sincerely, 

Krassimir Kostov