Re: [Architecture] Configuring transport and security policy in dataservice config
Hi,

Earlier I had the idea that the ESB was doing the same thing, and I thought the convention was easier since it needs fewer properties. I did not really consider that the policy could be reused by other services, and I later learned that this is not the case. So yes, in that case let's have another property, such as policyKey or policyPath, to give the path to the policy.

Cheers,
Anjana.

--
Anjana Fernando
Senior Technical Lead
WSO2 Inc. | http://wso2.com
lean . enterprise . middleware

On Thu, Aug 21, 2014 at 9:01 AM, Selvaratnam Uthaiyashankar shan...@wso2.com wrote:

Why do you prefer convention over an explicit policy location? For example, say I have data services 1, 2 and 3. Data services 1 and 2 use policy 1, and data service 3 uses policy 2. With a convention, you can have either 1 policy or 3 policies for this case. You will not be able to have only 2 policies.

--
S.Uthaiyashankar
VP Engineering
WSO2 Inc. http://wso2.com/ - lean . enterprise . middleware
Phone: +94 714897591

On Wednesday, August 20, 2014, Anjana Fernando anj...@wso2.com wrote:

Hi Chanika,

Let's just put enableSec as an attribute on the root element of the data service configuration, like <data enableSec="true" ..>. As for the policy file location, I guess there is a standard location the ESB looks up if it is not given explicitly, so we can also skip the policy location attribute and just go by a convention for where the policy file is located.

Cheers,
Anjana.

On Wed, Aug 20, 2014 at 2:17 PM, Chanika Geeganage chan...@wso2.com wrote:

Hi,

We recently came across a requirement to support QoS-related configuration in the .dbs file itself rather than in a separate services.xml file. We are therefore going to add the transport and security policy configuration in the same way as in ESB proxy service configurations. The changes are:

1. Adding a transports="https http" attribute to configure transport information.
2. Adding an enableSec tag together with a policy key to configure security, i.e. <policy key="path/to/policy"/> <enableSec/>

These configurations will be extracted at deployment time. Would this be a good approach to follow?

Thanks

--
Best Regards..
Chanika Geeganage
Software Engineer
WSO2, Inc.; http://wso2.com

___
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
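The point about convention versus an explicit policy location can be made concrete with a small sketch. It is illustrative only: the `DataServiceConfig` class, the `policyKey` field, and the `conf/policies/<service>.xml` convention are assumptions invented for this example, not the data service configuration API or the actual ESB lookup rule. It counts how many distinct policy files three services need under a per-service convention and under explicit keys.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of a data service's security settings (not the real .dbs API).
class DataServiceConfig {
    final String name;
    final String policyKey; // explicit policy location; null means "not given"

    DataServiceConfig(String name, String policyKey) {
        this.name = name;
        this.policyKey = policyKey;
    }
}

class PolicyResolutionDemo {
    // Assumed convention for illustration: one policy file per service name.
    static String byConvention(DataServiceConfig ds) {
        return "conf/policies/" + ds.name + ".xml";
    }

    // An explicit key wins; fall back to the convention when no key is given.
    static String resolve(DataServiceConfig ds) {
        return ds.policyKey != null ? ds.policyKey : byConvention(ds);
    }

    // Number of distinct policy files needed to secure the given services.
    static int distinctPolicies(DataServiceConfig[] services, boolean conventionOnly) {
        Set<String> paths = new HashSet<>();
        for (DataServiceConfig ds : services) {
            paths.add(conventionOnly ? byConvention(ds) : resolve(ds));
        }
        return paths.size();
    }

    public static void main(String[] args) {
        DataServiceConfig[] services = {
            new DataServiceConfig("ds1", "policies/policy1.xml"),
            new DataServiceConfig("ds2", "policies/policy1.xml"),
            new DataServiceConfig("ds3", "policies/policy2.xml")
        };
        System.out.println("convention only: " + distinctPolicies(services, true));
        System.out.println("explicit keys:   " + distinctPolicies(services, false));
    }
}
```

With a per-service convention every service needs its own file, while explicit keys let data services 1 and 2 share one policy, which is the two-policy case raised in the thread.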
Re: [Architecture] [AF] Datasources for PHP application type
Any thoughts please?

On Tue, Aug 19, 2014 at 8:14 PM, Madhawa Bandara madh...@wso2.com wrote:

Hi,

Appfactory supports defining data sources and using them in Java applications. In the process of enabling PHP app type support in Appfactory, we need to allow users (i.e. developers) to create data sources in Appfactory and use them directly inside their PHP applications.

PHP applications use odbc_connect ( string $dsn , string $user , string $password [, int $cursor_type ] ) to connect to a database. There are third-party libraries that enable Java inside PHP scripts [1]. An example of a JNDI lookup inside PHP is in [2].

What are the preferable options for allowing data sources to be called directly from PHP apps? Your ideas are welcome.

[1] - http://php-java-bridge.sourceforge.net/pjb/
[2] - http://php-java-bridge.sourceforge.net/pjb/examples/source.php?source=documentClient.php

--
Regards,
Madhawa Bandara
Software Engineer
WSO2, Inc.
lean.enterprise.middleware
Mobile - +94777487726
Blog - classdeffound.blogspot.com
Re: [Architecture] [AF] Datasources for PHP application type
Hi Madhawa,

We can keep these variables (string $dsn, string $user, string $password) in the registry and use the registry REST API to get their values at runtime. That way, when you promote the application to the Test and Production environments, the application will pick up the environment-specific values. This will not break the PHP developer experience either.

Thank you.

--
Manjula Rathnayaka
Software Engineer
WSO2, Inc.
Mobile: +94 77 743 1987

On Thu, Aug 21, 2014 at 7:00 PM, Madhawa Bandara madh...@wso2.com wrote:

Any thoughts please?
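The idea of resolving connection settings per environment can be sketched as follows. This is a rough illustration written in Java for consistency with the other code on this list, not PHP and not the registry REST API: `FakeRegistry` is an in-memory stand-in, and the `<stage>/<datasource name>` key layout, the stage names, and the sample values are all assumptions made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// The three values a PHP app would pass to odbc_connect($dsn, $user, $password).
class DataSourceProps {
    final String dsn;
    final String user;
    final String password;

    DataSourceProps(String dsn, String user, String password) {
        this.dsn = dsn;
        this.user = user;
        this.password = password;
    }
}

// In-memory stand-in for the registry; a real app would fetch these values over REST.
class FakeRegistry {
    private final Map<String, DataSourceProps> entries = new HashMap<>();

    // Hypothetical key layout: <stage>/<datasource name>.
    void put(String stage, String name, DataSourceProps props) {
        entries.put(stage + "/" + name, props);
    }

    DataSourceProps get(String stage, String name) {
        DataSourceProps props = entries.get(stage + "/" + name);
        if (props == null) {
            throw new IllegalArgumentException("No datasource '" + name + "' for stage " + stage);
        }
        return props;
    }
}

class StageLookupDemo {
    // Sample data only; the DSNs and users are invented.
    static FakeRegistry sampleRegistry() {
        FakeRegistry registry = new FakeRegistry();
        registry.put("Development", "customers", new DataSourceProps("dev_dsn", "dev_user", "dev_pw"));
        registry.put("Test", "customers", new DataSourceProps("test_dsn", "test_user", "test_pw"));
        registry.put("Production", "customers", new DataSourceProps("prod_dsn", "prod_user", "prod_pw"));
        return registry;
    }

    public static void main(String[] args) {
        // The application code stays the same; only the stage it runs in changes.
        String stage = args.length > 0 ? args[0] : "Development";
        DataSourceProps props = sampleRegistry().get(stage, "customers");
        System.out.println(stage + " uses DSN " + props.dsn + " as user " + props.user);
    }
}
```

The application asks for the same datasource name in every stage, so promoting it from Development to Test or Production changes the values it receives without changing its code.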
Re: [Architecture] [AF] Datasources for PHP application type
Hi Madhawa.

> We can keep these variables (string $dsn, string $user, string $password) in the registry and use the registry REST API to get their values at runtime. That way, when you promote the application to the Test and Production environments, the application will pick up the environment-specific values. This will not break the PHP developer experience either.

+1 for the above answer.

Moreover, AFAIK we cannot directly run a piece of Java code inside a PHP script, as PHP is an interpreted language. The cleanest way would be to expose a web service, but there is also an ugly way: you can execute a jar file from a PHP script using shell_exec() [1].

[1] http://php.net/manual/en/function.shell-exec.php

Regards
Roshan.

--
Roshan Wijesena.
Senior Software Engineer - WSO2 Inc.
Mobile: +94752126789
Email: ros...@wso2.com
WSO2, Inc.: http://wso2.com/
lean.enterprise.middleware.
[Architecture] [Cloud] Tenant deletion
Hi Everyone,

We are working on the training project "[Cloud] Tenant deletion code/script for cloud" - https://redmine.wso2.com/issues/3121.

Listed below is the workflow of the tenant deletion process in the App Cloud, as we identified it.

1. Undeploy Jenkins web app from application server
2. Delete Git repository (use gitblit API to delete the repo in Git)
3. Unsubscribe from Stratos using Stratos REST services
4. Check for databases created by RSSAdmin and delete them
5. Perform the TenantMgtAdminService deleteTenant operation:
   i. Delete Billing data
   ii. Delete Tenant Registration Data (Ex. REG_CLUSTER_LOCK, REG_LOG)
   iii. Delete Tenant User management data (Ex. UM_USER_PERMISSION, UM_USER)
   iv. Remove Tenant information from cache
   v. Delete UM_TENANT table

As per the analysis, we have identified two ways to implement this: BPEL and a Carbon component. We thought of going for a *Carbon component* implementation rather than *BPEL* for the following reasons.

1. Plugging in a Carbon component gives more extensibility for implementing the tenant deletion operation in future cloud-based products.
2. If we used BPEL, we would have to rebuild the process each time we meet a new requirement (e.g. ESB cloud integration).

Proposed Solution

1. Create an abstraction for the delete operation

    public interface TenantDeletion {
        public void onDeletion();
    }

2. Implement TenantDeletion for each operation

    public class JenkinsAppUndeployer implements TenantDeletion {
        public void onDeletion() {
            // Implementation of the JenkinsApp undeploy process
        }
    }

3. Use a configuration file to maintain the execution order, which helps to add new requirements dynamically

    <ExecutionOrder>
        <class name="org.wso2.cloud.tenant.JenkinsAppUndeployer"></class>
        <class name="org.wso2.cloud.tenant.GitRepoRemover"></class>
        <class name="org.wso2.cloud.tenant.XX"></class>
    </ExecutionOrder>

We are looking for feedback on this so that we can move forward with the selected design.

--
Mahesh Chinthaka
Software Engineer, WSO2.
Phone : (+94) 71 63 63 083
Email : mahe...@wso2.com
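A minimal sketch of how the proposed pieces could fit together is shown below. It is an illustration under assumptions, not the project's implementation: the `TenantDeletionRunner` class is invented here, the list of class names stands in for the parsed ExecutionOrder file (XML parsing is left out), and the step bodies only print a message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// The abstraction from the proposal.
interface TenantDeletion {
    void onDeletion();
}

// Placeholder steps; real ones would call Jenkins, gitblit, Stratos and so on.
class JenkinsAppUndeployer implements TenantDeletion {
    @Override
    public void onDeletion() {
        System.out.println("undeploying Jenkins web app");
    }
}

class GitRepoRemover implements TenantDeletion {
    @Override
    public void onDeletion() {
        System.out.println("deleting Git repository");
    }
}

// Hypothetical runner: loads each configured class by name and runs it in the listed order.
class TenantDeletionRunner {
    static List<String> run(List<String> classNames) {
        List<String> executed = new ArrayList<>();
        for (String className : classNames) {
            try {
                TenantDeletion step = Class.forName(className)
                        .asSubclass(TenantDeletion.class)
                        .getDeclaredConstructor()
                        .newInstance();
                step.onDeletion();
                executed.add(className);
            } catch (ReflectiveOperationException e) {
                // Stop at the first step that cannot be loaded or instantiated.
                throw new IllegalStateException("Cannot run deletion step " + className, e);
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        // Stands in for the class names read from the ExecutionOrder configuration.
        List<String> order = Arrays.asList(
                JenkinsAppUndeployer.class.getName(),
                GitRepoRemover.class.getName());
        run(order);
    }
}
```

Adding a new cleanup step then means writing one more TenantDeletion implementation and listing its class name in the configuration, without touching the runner.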
Re: [Architecture] [Cloud] Tenant deletion
On Thu, Aug 21, 2014 at 8:24 PM, Mahesh Chinthaka mahe...@wso2.com wrote:

> Hi Everyone,
>
> We are working on the training project "[Cloud] Tenant deletion code/script for cloud" - https://redmine.wso2.com/issues/3121.
>
> Listed below is the workflow of the tenant deletion process in the App Cloud, as we identified it.
>
> 1. Undeploy Jenkins web app from application server
> 2. Delete Git repository (use gitblit API to delete the repo in Git)
> 3. Unsubscribe from Stratos using Stratos REST services
> 4. Check for databases created by RSSAdmin and delete them
> 5. Perform the TenantMgtAdminService deleteTenant operation:
>    i. Delete Billing data
>    ii. Delete Tenant Registration Data (Ex. REG_CLUSTER_LOCK, REG_LOG)
>    iii. Delete Tenant User management data (Ex. UM_USER_PERMISSION, UM_USER)
>    iv. Remove Tenant information from cache
>    v. Delete UM_TENANT table

Don't you need to clean up the issue tracker?

> As per the analysis, we have identified two ways to implement this: BPEL and a Carbon component. We thought of going for a *Carbon component* implementation rather than *BPEL* for the following reasons.
>
> 1. Plugging in a Carbon component gives more extensibility for implementing the tenant deletion operation in future cloud-based products.
> 2. If we used BPEL, we would have to rebuild the process each time we meet a new requirement (e.g. ESB cloud integration).
>
> Proposed Solution

Why can't you use the existing TenantMgtListener and add an onDelete method? It also has a ListenerOrder, and every implementation should be registered as an OSGi service.

> 1. Create an abstraction for the delete operation
>
>     public interface TenantDeletion {
>         public void onDeletion();
>     }
>
> 2. Implement TenantDeletion for each operation
>
>     public class JenkinsAppUndeployer implements TenantDeletion {
>         public void onDeletion() {
>             // Implementation of the JenkinsApp undeploy process
>         }
>     }
>
> 3. Use a configuration file to maintain the execution order, which helps to add new requirements dynamically
>
>     <ExecutionOrder>
>         <class name="org.wso2.cloud.tenant.JenkinsAppUndeployer"></class>
>         <class name="org.wso2.cloud.tenant.GitRepoRemover"></class>
>         <class name="org.wso2.cloud.tenant.XX"></class>
>     </ExecutionOrder>
>
> We are looking for feedback on this so that we can move forward with the selected design.

--
ajanthan
--
Ajanthan Balachandiran
Senior Software Engineer; Solutions Technologies Team; WSO2, Inc.; http://wso2.com/
cell: +94775581497
blog: http://bkayts.blogspot.com/
Lean . Enterprise . Middleware
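The listener-based alternative can be sketched roughly as below. Everything here is a simplified stand-in: `TenantCleanupListener` is a hypothetical interface invented for this example and does not reproduce the real TenantMgtListener signature, a plain list replaces OSGi service registration, and "lower order value runs first" is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified stand-in; NOT the real TenantMgtListener interface.
interface TenantCleanupListener {
    int getListenerOrder();      // assumed: lower values run first
    void onDelete(int tenantId);
}

class ListenerRegistry {
    private final List<TenantCleanupListener> listeners = new ArrayList<>();

    // Stands in for registering the listener as an OSGi service.
    void register(TenantCleanupListener listener) {
        listeners.add(listener);
    }

    // Calls every listener in ascending order and returns the order values visited.
    List<Integer> fireOnDelete(int tenantId) {
        List<TenantCleanupListener> sorted = new ArrayList<>(listeners);
        sorted.sort(Comparator.comparingInt(TenantCleanupListener::getListenerOrder));
        List<Integer> visited = new ArrayList<>();
        for (TenantCleanupListener listener : sorted) {
            listener.onDelete(tenantId);
            visited.add(listener.getListenerOrder());
        }
        return visited;
    }
}

class ListenerOrderDemo {
    // Builds a listener that only prints what it would clean up.
    static TenantCleanupListener listener(final int order, final String label) {
        return new TenantCleanupListener() {
            @Override
            public int getListenerOrder() {
                return order;
            }

            @Override
            public void onDelete(int tenantId) {
                System.out.println(label + " cleaned up tenant " + tenantId);
            }
        };
    }

    public static void main(String[] args) {
        ListenerRegistry registry = new ListenerRegistry();
        // Registration order does not matter; the listener order decides.
        registry.register(listener(20, "git repo remover"));
        registry.register(listener(10, "jenkins undeployer"));
        registry.fireOnDelete(42);
    }
}
```

Compared with a separate ExecutionOrder file, each implementation carries its own position, so a product team can plug in a cleanup step without editing shared configuration.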
Re: [Architecture] [AF] Datasources for PHP application type
On Fri, Aug 22, 2014 at 9:53 AM, Manjula Rathnayake manju...@wso2.com wrote:

> Hi all,
>
> On Fri, Aug 22, 2014 at 8:41 AM, Dimuthu Leelarathne dimut...@wso2.com wrote:
>
>> Hi Madhawa,
>>
>> Does PHP have native datasource support? For example, see [1]. I am -1 on doing it through Java. We must look at how the PHP community does it.
>>
>> The first thing is to see how the PHP community uses databases in apps. If they do have a native datasource concept, we have to use it. If that is not available, the second option is using variables and calling the registry via REST APIs.
>
> +1. And AFAIK, web developers keep these variables in a configuration file. This is because they have externalized all the variables which need to be replaced when they deploy in a new environment. If we provide a mechanism to upload a complete configuration instead of going property by property, it will make the developer's life easier.

Here there are some concerns. How is the user going to manage the credentials for calling the REST API? Are we recommending a config file inside the source tree with an encrypted password? Then there is the problem of sharing the private key between the user and the server. Mutual SSL also has some limitations: if the user happens to know the admin username, he can set it in the header and perform operations as admin.

> thank you.
>
>> thanks,
>> dimuthu
>>
>> [1] http://book.cakephp.org/2.0/en/models/datasources.html
>>
>> --
>> Dimuthu Leelarathne
>> Architect, Product Lead of App Factory
>> WSO2, Inc. (http://wso2.com)
>> email: dimut...@wso2.com
>> Mobile : 0773661935
>> Lean . Enterprise . Middleware

--
ajanthan
--
Ajanthan Balachandiran
Senior Software Engineer; Solutions Technologies Team; WSO2, Inc.; http://wso2.com/
cell: +94775581497
blog: http://bkayts.blogspot.com/
Lean . Enterprise . Middleware
Re: [Architecture] [AF] Datasources for PHP application type
Hi Ajanthan,

On Fri, Aug 22, 2014 at 10:08 AM, Ajanthan Balachandran ajant...@wso2.com wrote:

> Here there are some concerns. How is the user going to manage the credentials for calling the REST API? Are we recommending a config file inside the source tree with an encrypted password? Then there is the problem of sharing the private key between the user and the server. Mutual SSL also has some limitations: if the user happens to know the admin username, he can set it in the header and perform operations as admin.

Good point. We have to go with an OAuth-based solution; this is a matter of REST API security. We can expose these REST APIs via API Manager too.

Thank you.

--
Manjula Rathnayaka
Software Engineer
WSO2, Inc.
Mobile: +94 77 743 1987
Re: [Architecture] [Cloud] Tenant deletion
Hi,

+1 for the OnPreDelete concept. But the thing is, we don't have these Pre and Post events anywhere in the platform. I think that is something we should consider. WDYT?

Thanks & Regards
Danushka Fernando
Software Engineer
WSO2 inc. http://wso2.com/
Mobile : +94716332729

On Fri, Aug 22, 2014 at 9:14 AM, Dimuthu Leelarathne dimut...@wso2.com wrote:

Hi Mahesh, all,

Let's consider the Carbon Platform aspect first. Before we remove the tenant from user core and the registry, we have to delete it from all other places. So +1 for an interface that allows different product teams to plug in their own cleanup process, but here is what I recommend.

We need a method called onPreDelete() on TenantMgtListener, so that all product teams can implement it. The first rule of thumb is that any product moving to the cloud must implement this method and prove that it cleans up the tenant before it moves to WSO2Cloud. So basically, what you have to do in tenant.core is call onPreDelete() on all OSGi-registered TenantMgtListeners, then delete from the registry, and finally from user.core. That would be the most elegant and extensible fix for the platform.

Now we come to AF as a product/solution. To be WSO2Cloud friendly, we as a product have to implement the onPreDelete() method, and we as a product team should decide whether or not we implement it using BPEL. My feeling is that the way to do it is code + BPEL.

thanks,
dimuthu

On Fri, Aug 22, 2014 at 7:26 AM, Ajanthan Balachandran ajant...@wso2.com wrote:

On Fri, Aug 22, 2014 at 5:48 AM, Danushka Fernando danush...@wso2.com wrote:

Hi Ajanthan,

The problem with OnDelete is that it is called after the tenant is deleted (after deleting the userstore and registry). But we need to clean up before that; otherwise we cannot call admin services, since the tenant is no longer there. As I mentioned in the previous thread, we need to call this at an OnPreDelete.

IMO the OnDelete method should be called as the first step.

@Mahesh: I think you have missed the delete applications step. And the delete applications step would cover the issue tracker details as well, I guess.

@Ajanthan: Correct me if I am wrong.

Looping through each application and deleting it will not be a scalable solution.

Thanks & Regards
Danushka Fernando
Software Engineer
WSO2 inc. http://wso2.com/
Mobile : +94716332729
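The ordering discussed in this thread (run product cleanup first, then delete from the registry, then from user core) can be sketched as follows. This is a toy model under assumptions: `TenantStore`, `PreDeleteListener` and `TenantRemover` are names invented for the example and do not correspond to tenant.core classes, and two in-memory sets stand in for the registry and user core.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy stand-in for the registry and user core tenant records.
class TenantStore {
    private final Set<Integer> registry = new HashSet<>();
    private final Set<Integer> userCore = new HashSet<>();

    void create(int tenantId) {
        registry.add(tenantId);
        userCore.add(tenantId);
    }

    // A tenant is usable (e.g. for admin service calls) only while both records exist.
    boolean exists(int tenantId) {
        return registry.contains(tenantId) && userCore.contains(tenantId);
    }

    void deleteFromRegistry(int tenantId) {
        registry.remove(tenantId);
    }

    void deleteFromUserCore(int tenantId) {
        userCore.remove(tenantId);
    }
}

// Hypothetical hook, named after the onPreDelete() idea in the thread.
interface PreDeleteListener {
    void onPreDelete(int tenantId, TenantStore store);
}

class TenantRemover {
    private final List<PreDeleteListener> listeners = new ArrayList<>();

    void addListener(PreDeleteListener listener) {
        listeners.add(listener);
    }

    void deleteTenant(int tenantId, TenantStore store) {
        // 1. Product-specific cleanup runs while the tenant still exists.
        for (PreDeleteListener listener : listeners) {
            listener.onPreDelete(tenantId, store);
        }
        // 2. Then remove the tenant from the registry.
        store.deleteFromRegistry(tenantId);
        // 3. Finally remove it from user core.
        store.deleteFromUserCore(tenantId);
    }
}

class PreDeleteDemo {
    public static void main(String[] args) {
        TenantStore store = new TenantStore();
        store.create(42);
        TenantRemover remover = new TenantRemover();
        remover.addListener((id, s) ->
                System.out.println("cleanup for tenant " + id + ", tenant still exists: " + s.exists(id)));
        remover.deleteTenant(42, store);
        System.out.println("exists after delete: " + store.exists(42));
    }
}
```

Because the listeners run before either record is removed, a cleanup step can still act on behalf of the tenant, which is the reason given in the thread for a pre-delete hook rather than a post-delete one.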
Re: [Architecture] [POC] Performance evaluation of Hive vs Shark
Hi Srinath,

Yes, I am working on deploying it on a multi-node cluster with the debs dataset. I will keep architecture@ posted on the progress.

Hi David,

Thank you very much for the detailed insight you've provided. A few quick questions:

1. Do you have experience in using storage handlers in Spark?
2. Would a storage handler used in Hive be directly compatible with Spark?
3. How do you grade the performance of Spark with other databases such as Cassandra, HBase, H2, etc.?

Thank you very much again for your interest. I look forward to hearing from you.

Regards

On Thu, Aug 21, 2014 at 7:02 PM, Srinath Perera srin...@wso2.com wrote:

Niranda, we need to test Spark in multi-node mode before making a decision. Spark is very fast; I think there is no doubt about that. We need to make sure it is stable.

David, thanks for a detailed email! How big (in nodes) is the Spark setup you guys are running?

--Srinath

On Thu, Aug 21, 2014 at 1:34 PM, David Morales dmora...@stratio.com wrote:

Sorry for disturbing this thread, but I think I can help clarify a few things (we attended the last Spark Summit, we were also speakers there, and we are working very closely with Spark).

*Hive/Shark and others benchmark*

You can find a nice comparison and benchmark on this site: https://amplab.cs.berkeley.edu/benchmark/

*Shark and SparkSQL*

SparkSQL is the natural replacement for Shark, but SparkSQL is still young at this moment. If you are looking for Hive compatibility, you have to execute SparkSQL with a specific context.

Quoted from the Spark website:

"Note that Spark SQL currently uses a very basic SQL parser. Users that want a more complete dialect of SQL should look at the HiveSQL support provided by HiveContext."

So just note that SparkSQL is a work in progress. If you want SparkSQL you have to run a SQLContext; if you want Hive, you will need a different context (HiveContext).

*Spark - Hadoop: the future*

Most Hadoop distributions are including Spark (Cloudera, Hortonworks, MapR...) and contributing to migrating the whole Hadoop ecosystem to Spark. Spark is a bit more than Map/Reduce, as you can read here: http://gigaom.com/2014/06/28/4-reasons-why-spark-could-jolt-hadoop-into-hyperdrive/

*Spark Streaming / Spark SQL*

Spark Streaming is built on Spark and provides stream processing through an abstraction called DStreams (a collection of RDDs in a window of time).

There are some efforts to make SparkSQL compatible with Spark Streaming (something similar to Trident for Storm), as you can see here:

"StreamSQL (https://github.com/thunderain-project/StreamSQL) is a POC project based on Spark to combine the power of Catalyst and Spark Streaming, to offer people the ability to manipulate SQL on top of DStream as you wanted, this keep the same semantics with SparkSQL as offer a SchemaDStream on top of DStream. You don't need to do tricky thing like extracting rdd to register as a table. Besides other parts are the same as Spark."

So you can apply SQL to a data stream, but it is very simple at the moment. You can expect a bunch of improvements in this area in the coming months (I guess that SparkSQL will work on Spark Streaming streams before the end of this year).

*Spark Streaming / Spark SQL and CEP*

There is no relationship at this moment between (your absolutely amazing) Siddhi CEP and Spark. As far as I know, you are working on distributed CEP with Storm and Siddhi.

We are currently working on an interactive CEP built with Kafka + Spark Streaming + Siddhi, with features such as an API, an interactive shell, built-in statistics and auditing, and built-in functions (save2cassandra, save2mongo, save2elasticsearch...). If you are interested we can talk about this project; I think it would be a nice idea!

Anyway, I don't think that SparkSQL will evolve into something like a CEP. Patterns and sequences, for example, would be very complex to do with Spark Streaming (at least for now).

Thanks.

2014-08-21 6:18 GMT+02:00 Sriskandarajah Suhothayan s...@wso2.com:

On Wed, Aug 20, 2014 at 1:36 PM, Niranda Perera nira...@wso2.com wrote:

@Maninda, +1 for suggesting Spark SQL.

Quoting Databricks: "Spark SQL provides state-of-the-art SQL performance and maintains compatibility with Shark/Hive. In particular, like Shark, Spark SQL supports all existing Hive data formats, user-defined functions (UDF), and the Hive metastore." [1]

But I am not entirely sure that Spark SQL and Siddhi are comparable, because SparkSQL (like Hive) is designed for batch processing, whereas Siddhi does real-time processing. But if there are implementations where Siddhi is run on top of Spark, it would be very interesting.

Yes, Siddhi's current way of operation does not support this. But with partitions we can achieve this to some extent.

Suho

Spark supports