[jira] [Commented] (AIRAVATA-2999) [GSoC] Administration Dashboard for Airavata Services

2019-04-02 Thread Suresh Marru (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRAVATA-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808260#comment-16808260
 ] 

Suresh Marru commented on AIRAVATA-2999:


Given the larger scope of this project, this one can probably be divided into 
two distinct GSoC projects.

> [GSoC] Administration Dashboard for Airavata Services
> -
>
> Key: AIRAVATA-2999
> URL: https://issues.apache.org/jira/browse/AIRAVATA-2999
> Project: Airavata
> Issue Type: New Feature
> Reporter: Dimuthu Upeksha
> Priority: Major
> Labels: gsoc2019
>
> A typical Apache Airavata deployment consists of multiple microservices (API 
> Server, Participant, Controller, Pre Workflow Manager, Post Workflow Manager, 
> Job Monitors, etc.) and several other services (Database, Kafka, RabbitMQ, 
> Keycloak, Zookeeper, Apache Helix). Because the deployment has so many 
> components, it is time consuming to find which component is at fault when 
> an issue occurs. We therefore need an Administration Dashboard that 
> visualizes the system health and gives administrators a handle to 
> control those services, such as stopping or restarting each component through 
> the dashboard.
> Additionally, this dashboard should authenticate users through 
> Keycloak, which is the identity provider for Airavata, and only system 
> administrators should be given access to those operations.
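
As a rough illustration of the health view described in the issue, the sketch below probes each component over TCP and lists the ones that do not respond. The function names and the idea of a name-to-endpoint map are hypothetical, not taken from Airavata; an open port also says nothing about whether the service behind it is actually healthy, so a real dashboard would need deeper checks.

```python
import socket
from typing import Callable, Dict, List, Tuple

Endpoint = Tuple[str, int]  # (host, port)


def tcp_probe(endpoint: Endpoint, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection(endpoint, timeout=timeout):
            return True
    except OSError:
        return False


def health_report(components: Dict[str, Endpoint],
                  probe: Callable[[Endpoint], bool] = tcp_probe) -> Dict[str, str]:
    """Map each component name to "UP" or "DOWN" using the given probe."""
    return {name: "UP" if probe(endpoint) else "DOWN"
            for name, endpoint in components.items()}


def failing_components(report: Dict[str, str]) -> List[str]:
    """Names of components that did not answer, sorted for stable output."""
    return sorted(name for name, status in report.items() if status != "UP")
```

Passing the probe as a parameter keeps the reporting logic separate from the network call, so the same report could later be fed by an HTTP health endpoint or a Helix status query instead of a bare TCP connect.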



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRAVATA-2999) [GSoC] Administration Dashboard for Airavata Services

2019-04-02 Thread Suresh Marru (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRAVATA-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808259#comment-16808259
 ] 

Suresh Marru commented on AIRAVATA-2999:


There are more monitoring instrumentation components that can be integrated: 
Kamon [https://github.com/kamon-io/Kamon], Zipkin 
[https://github.com/openzipkin/zipkin], and Datadog 
[https://github.com/DataDog/dd-trace-java].

Airavata previously had integrations with some of these - 
https://issues.apache.org/jira/browse/AIRAVATA-2107 



Initial GSOC Proposal draft

2019-04-02 Thread Alakh Prakash Singh Raghuvanshi
Hi,
I am interested in participating in GSOC this year and contributing to Apache 
Airavata. I have discussed the idea with Prof. Suresh and have written a basic 
proposal that outlines the idea. The document needs 
feedback to improve. I am also open to other project ideas if 
required.

Link to the document: 
https://docs.google.com/document/d/1s5g1zepJcrkkOhJPDQZEtQgL6XTZeqpotmuQhW27YmU/edit#heading=h.ere98fb8ub87
 


Thanks,
Alakh

Re: Need help setting up Django portal

2019-04-02 Thread Suresh Marru
Hi Alakh,

This is a good effort on your part to debug the problem and frame a helpful 
query to the mailing list. Well done. I do not have insights into your issues; 
maybe others will chime in.

Suresh


Re: Need help setting up Django portal

2019-04-02 Thread Alakh Prakash Singh Raghuvanshi
Hi,

I have tried debugging this issue, and I think the MySQL Docker container has 
something to do with it.
I get two different stack traces:
1. Database initialization fails because the exp_catalog.GATEWAY table query 
fails. But when I checked the files, create GATEWAY was the first query. I 
thought the query had issues, so I ran it on my local system, but it worked fine. 
2. Database initialization fails because of 'Could not connect: Unknown database 
‘experiment_catalog’’. As it says it could not connect, I went inside the 
MySQL container and found that user ‘root’ was not able to log in with 
the default password given in airavata-server.properties, although I was able 
to log in as user ‘airavata’.

As to why Django was not connecting: because the above DB initialization does 
not happen, the service 
‘org.apache.airavata.service.profile.server.ProfileServiceServer’, which is 
supposed to run on port 8962, does not come up.
Also, with Python 3.6 the Django portal installation is very smooth, with 
little to no issues. I think the prerequisites in the README file should be 
updated with the version number as well.

Initially the MySQL container was failing on my local system, but then I 
deleted all the project files and re-installed all the dependencies. It worked.
I am not sure how to resolve the database issue.

Thanks,
Alakh
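
The "Connection refused" errors in this thread (port 13306 for the database, port 8962 for the profile service) can be checked without starting the whole server. The sketch below is one way to wait until a port accepts connections before launching whatever depends on it; `wait_for_port` is a hypothetical helper, and the deadline and interval values are arbitrary assumptions.

```python
import socket
import time


def wait_for_port(host: str, port: int, deadline_s: float = 30.0,
                  interval_s: float = 1.0) -> bool:
    """Poll host:port until a TCP connect succeeds or the deadline passes."""
    end = time.monotonic() + deadline_s
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Covers "Connection refused" as well as connect timeouts.
            if time.monotonic() >= end:
                return False
            time.sleep(interval_s)
```

If `wait_for_port("localhost", 13306)` returns False, the problem lies with the database container rather than with the Django portal. A successful connect only shows that something is listening; it does not confirm that the expected database or user exists.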

> On 01-Apr-2019, at 4:11 PM, Suresh Marru  wrote:
> 
> Most likely. You can take the hint from the previous log “Could not connect 
> to any of [('::1', 8962, 0, 0), ('127.0.0.1', 8962)]”, search the IDE 
> integration code for what is supposed to run on port 8962, and then verify 
> whether it is running. The error message below says it is unable to connect 
> to the database. Can you debug that first? Most stack traces are written in 
> plain English, and I often find them helpful. 
> 
> Suresh
> 
>> On Apr 1, 2019, at 4:07 PM, Alakh Prakash Singh Raghuvanshi 
>> <alakh...@gmail.com> wrote:
>> 
>> Hi,
>> 
>> Could it be because of this 
>> 
>> Exception in thread "main" java.lang.RuntimeException: Failed to initialize database for database_scripts/expcatalog
>>  at org.apache.airavata.common.utils.DBInitializer.initializeDB(DBInitializer.java:63)
>>  at org.apache.airavata.common.utils.DBInitializer.initializeDB(DBInitializer.java:45)
>>  at org.apache.airavata.registry.api.service.RegistryAPIServer.StartRegistryServer(RegistryAPIServer.java:69)
>>  at org.apache.airavata.registry.api.service.RegistryAPIServer.start(RegistryAPIServer.java:151)
>>  at org.apache.airavata.ide.integration.APIServerStarter.main(APIServerStarter.java:23)
>> Caused by: java.sql.SQLNonTransientConnectionException: Could not connect to address=(host=localhost)(port=13306)(type=master) : Connection refused
>>  at org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.get(ExceptionMapper.java:156)
>>  at org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.getException(ExceptionMapper.java:118)
>>  at org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.throwException(ExceptionMapper.java:92)
>>  at org.mariadb.jdbc.Driver.connect(Driver.java:111)
>>  at java.sql.DriverManager.getConnection(DriverManager.java:664)
>>  at java.sql.DriverManager.getConnection(DriverManager.java:208)
>>  at org.apache.airavata.common.utils.DBUtil.getConnection(DBUtil.java:212)
>>  at org.apache.airavata.common.utils.DBInitializer.initializeDB(DBInitializer.java:54)
>>  ... 4 more
>> Caused by: java.sql.SQLException: Could not connect to address=(host=localhost)(port=13306)(type=master) : Connection refused
>>  at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:1029)
>>  at org.mariadb.jdbc.internal.util.Utils.retrieveProxy(Utils.java:483)
>>  at org.mariadb.jdbc.Driver.connect(Driver.java:106)
>>  ... 8 more
>> Caused by: java.net.ConnectException: Connection refused
>>  at java.net.PlainSocketImpl.socketConnect(Native Method)
>>  at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>>  at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>>  at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>>  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>  at java.net.Socket.connect(Socket.java:589)
>>  at java.net.Socket.connect(Socket.java:538)
>>  at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connect(AbstractConnectProtocol.java:401)
>>  at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:1022)
>>  ... 10 more
>> 
>> I get this trace when I run the 
>> org.apache.airavata.ide.integration.APIServerStarter file.
>> Thanks,
>> Alakh
>> 
>>> On 01-Apr-2019, at 4:04 PM, Alakh Prakash Singh 

Re: [GSoC] Proposal template for GSoC

2019-04-02 Thread Suresh Marru
To reference your mailing list contributions, you can use either the MarkMail 
archives (https://airavata.markmail.org/) or 
lists.apache.org. For example, links to this interaction look like this:

https://markmail.org/thread/gi4el6mc5ejv7bie 


Or 

https://lists.apache.org/thread.html/6b585fc8b44e7c3064f792c06cf79db6dda72453711e0c84301825a7@%3Cdev.airavata.apache.org%3E
 


Suresh

> On Apr 2, 2019, at 8:45 AM, Suresh Marru  wrote:
> 
> GSoC Aspirants,
> 
> Please draft your proposal in a Google Doc. You can copy this template and 
> fill it in:
> 
> https://docs.google.com/document/d/1I4iAagTdiSyzYZCr4_zEMtzoABzLVfUyOnC4rqo_nrI/edit?usp=sharing
> 
> At Apache Software Foundation, we will particularly require you to clearly 
> mention your “other commitments” as well as community engagement (mailing 
> list interactions, pull requests, JIRA discussions).
> 
> In addition, make sure you read the proposal guide and follow the 
> instructions closely - 
> https://google.github.io/gsocguides/student/writing-a-proposal
> 
> Suresh



Re: GSOC: Feedback for Project Proposal

2019-04-02 Thread Suresh Marru
Hi Sai Rohith,

It is good that you started on the draft and shared it. Can you put this in the 
template format I sent to the mailing list earlier today, as a Google Doc, give 
comment permission, and share it with the list?

Suresh

> On Apr 2, 2019, at 12:17 AM, Achanta, Sai Rohith  wrote:
> 
> Hi Team,
>  
> I’m interested in participating in GSOC and want to contribute to Apache 
> Airavata. I have gone through some of the JIRA tickets regarding GSOC and 
> prepared a rough draft of what I’m planning to do, along with timelines. 
> The document might be naïve and needs feedback to improve and to meet the 
> standards of a project proposal document for GSOC. Also, let me know if I’m 
> moving in the right direction.
> I have attached the document to this mail. Please find the attachment.
>  
> I’m flexible, and if you think that there is something else to which I can 
> contribute, please let me know.
>  
> Thanks and regards,
> Sai Rohith Achanta.
> 



Re: ZkHelixManager disconnection hangs

2019-04-02 Thread DImuthu Upeksha
Hi Jiajun, Kishore and others,

Thanks for looking into this. We used 0.8.1 in that setup and have now
upgraded it to 0.8.2. However, this was a frequently occurring issue that
forced us to manually kill and restart the controller, and we expect it to
happen again if it isn't fixed in 0.8.2. The main reason is that we rely on a
single ZooKeeper node across a fairly unreliable network for our staging
environment. We will let you know if we see the issue again. In the
meantime, I will try to reproduce it with 0.8.2 in my local Helix deployment.

Thanks
Dimuthu

On Tue, Apr 2, 2019 at 1:58 AM ericwang1985  wrote:

> Could you please confirm the Helix version that is used, Dimuthu?
> The thing is that we have fixed several potential ZkHelixManager
> concurrency issues in 0.8.2. Basically, that was a race condition in which
> the disconnect method could get a disconnected non-null zkclient. In this
> case, reset handler will never finish.
>
> Please let us know if you are already using 0.8.2 or a later version. That
> probably means we have a new bug to fix.
>
> Cheers,
> -Jiajun
>
> On Apr 1, 2019, at 13:15, kishore g  wrote:
>
> This is a good catch. @Wang Jiajun  the stack
> trace is good enough to fix this right. We just have to look at all the
> paths we can get into this method and make sure resetHandler is thread safe
> and validates the state of the zkConnection and handlers.
>
> On Mon, Apr 1, 2019 at 12:41 PM Wang Jiajun 
> wrote:
>
>> Hi Dimuthu,
>>
>> Did you stop the controller when the connection is flapping or when it is
>> normal?
>> Could you please list all the steps that you have done in order?
>>
>> Best Regards,
>> Jiajun
>>
>>
>> On Sat, Mar 30, 2019 at 5:54 AM DImuthu Upeksha <
>> dimuthu.upeks...@gmail.com>
>> wrote:
>>
>> > Hi Folks,
>> >
>> > In the Helix controller, we have seen the log line below, and by looking
>> > at the code, I understood that it is because ZkHelixManager failed to
>> > connect to ZooKeeper 5 times. So I tried to stop the controller, and the
>> > stop logic calls the ZkHelixManager.disconnect() method, which hangs. I
>> > got a thread dump, and you can see where it is waiting. Can you please
>> > advise on a better approach to solve this?
>> >
>> > I noticed that ZkHelixManager disconnects [1] itself when flapping is
>> > detected. Is calling disconnect() twice the reason for that?
>> >
>> > 2019-03-29 15:19:56,832 [ZkClient-EventThread-14-api.staging.scigap.org:2181] ERROR o.a.h.m.zk.ZKHelixManager  - instanceName: helixcontroller is flapping. disconnect it.  maxDisconnectThreshold: 5 disconnects in 30ms.
>> >
>> > Thread-5 - priority:5 - threadId:0x7f5c740023f0 - nativeId:0x63f1 - nativeId (decimal):25585 - state:BLOCKED
>> > stackTrace:
>> > java.lang.Thread.State: BLOCKED (on object monitor)
>> > at org.apache.helix.manager.zk.ZKHelixManager.resetHandlers(ZKHelixManager.java:903)
>> > - waiting to lock <0x0006c7e08110> (a org.apache.helix.manager.zk.ZKHelixManager)
>> > at org.apache.helix.manager.zk.ZKHelixManager.disconnect(ZKHelixManager.java:693)
>> > at org.apache.airavata.helix.impl.controller.HelixController.disconnect(HelixController.java:103)
>> > at org.apache.airavata.helix.impl.controller.HelixController$$Lambda$2/846492085.run(Unknown Source)
>> > at java.lang.Thread.run(Thread.java:748)
>> > Locked ownable synchronizers:
>> > - None
>> >
>> > [1] https://github.com/apache/helix/blob/helix-0.8.2/helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixManager.java#L991
>> > Thanks
>> > Dimuthu
>> >
>>
>
>
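
The thread dump above shows the shutdown thread BLOCKED while waiting for a monitor that another thread holds. As a small illustration of that situation, and not of Helix's actual code or of the fix made in 0.8.2, the Python sketch below guards a hypothetical `disconnect()` with a lock, makes a repeated call a no-op, and bounds the wait so a caller can give up instead of hanging forever. `FakeManager` and all of its members are invented for this example.

```python
import threading


class FakeManager:
    """Hypothetical stand-in for a connection manager (not the Helix API)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._disconnected = False
        self.cleanup_runs = 0

    def disconnect(self, timeout_s: float = 5.0) -> bool:
        """Return True once disconnected, False if the lock was not obtained."""
        # A bounded acquire is the difference from a plain monitor wait:
        # the caller gets control back even if the holder never releases.
        if not self._lock.acquire(timeout=timeout_s):
            return False
        try:
            if not self._disconnected:
                self.cleanup_runs += 1  # placeholder for real cleanup work
                self._disconnected = True
            return True
        finally:
            self._lock.release()
```

With this shape, a second `disconnect()` call does no extra work, and a shutdown hook that receives False can log the failure and exit rather than wait indefinitely.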


[GSoC] Proposal template for GSoC

2019-04-02 Thread Suresh Marru
GSoC Aspirants,

Please draft your proposal in a Google Doc. You can copy this template and fill 
it in:

https://docs.google.com/document/d/1I4iAagTdiSyzYZCr4_zEMtzoABzLVfUyOnC4rqo_nrI/edit?usp=sharing

At Apache Software Foundation, we will particularly require you to clearly 
mention your “other commitments” as well as community engagement (mailing list 
interactions, pull requests, JIRA discussions).

In addition, make sure you read the proposal guide and follow the instructions 
closely - https://google.github.io/gsocguides/student/writing-a-proposal

Suresh