Hi Apoorv,

For all code contributions, I always suggest that you fork the Airavata sandbox repository and submit a pull request from your fork. That way you have a provenance of your contributions to a major open source foundation. Moreover, a PR is easier to review and give feedback on than a standalone repo.
This is great work though. I hope Shamaeera can review and provide feedback; he has been the most experienced on this topic and its associated pragmatic issues.

Suresh

> On Jul 17, 2017, at 11:30 AM, Apoorv Palkar <[email protected]> wrote:
>
> Hey Dev,
>
> For the past 3-3.5 weeks, I've been investigating the use of Helix in
> Airavata and working on the email monitoring problem. I went through the
> Curator/Zookeeper code to test out the internal workings of Helix. A
> particular question I had was: what is the difference between the external
> view and the current state? I understood that Helix uses the resource model
> to maintain both the ideal state and the current state. Why is it necessary
> to have an external view? In addition, what is the purpose of a spectator
> node? The documentation states that a "spectator" reacts to changes in a
> distributed system. Why give that particular node limited abilities when
> you could give it full access? These questions may be important to consider
> when writing the Helix paper for submission.
>
> As for the mailing/monitoring system, I have decided to move forward with
> the JavaMail API + IMAP implementation. I used the [email protected] (gmail)
> address as a basis for running my test code. For this particular use case,
> I didn't use the Gmail API because it had limited capabilities in terms of
> function/library uses. I played around with the Gmail API; however, I was
> unsuccessful in getting it to work in a clean and efficient manner. As
> such, I decided to use the JavaMail API provided via imported libraries.
> IMAP was considered because it had greater capabilities than POP3. POP3 was
> inefficient when fetching the emails.
>
> In terms of first reading the emails, the first challenge was to set up the
> code correctly to read from Gmail. Previously the issue was that the emails
> were being read every time the read() function was called in the Inbox
> class.
> This meant that every message would be pulled even if only one email was
> unread. This proved to be highly time costly, as the scigap email address
> has 10000+ emails at any given time. I set up boolean flags for emails that
> were read and ones that were unread. As a result, all messages don't have
> to be pulled; only the ones with a "false" flag need to be read. These
> messages were pulled and then put into a Message[] array. This array was
> then sorted using a lambda expression, as JavaMail retrieves the most
> recent message last. After these messages are put into the array and dealt
> with, the messages are marked as "read" to avoid reading them again.
>
> Currently, I'm working on improving the implementations of all four email
> parsers. It is important to make sure these parsers run efficiently, as
> many emails will be read. I didn't want to use regex as it is slightly
> slower than string operations. For my demo code, I have used string
> operations to parse the subject title/content. In reality, an array or the
> StringBuilder class should be used when implemented professionally to
> improve speed. Currently, I'm refactoring the PBS code to run a bit more
> optimally and running test cases for the other two email types.
>
> Below is a link for the gmail implementation + SLURM interpreter. Basically,
> the idea is to have 4 classes that each handle one email type and parse the
> messages from the Message[] array. The idea is then to take the COMMON data
> collected, such as job_id, name, status, and time, and put it into a Thrift
> data model file. Using this Thrift model, a Java Thrift object is then
> created to send over an AMQP message queue, RabbitMQ, to potentially be
> used in a MySQL/SQL database. As of now, the database part is not clear,
> but it would most likely be a registry that needs to be updated via the
> Java JPA library/SQL queries.
>
> https://github.com/chessman179/gmailtestinged <<<<<<<<<<<<< code.
>
>
> ** big shout out to Marcus --
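
For reference, the string-operation parsing described in the quoted message could look roughly like the sketch below. It is not the code from the linked repository. The class name SlurmSubjectParser, the helper names, and the sample subject line are all hypothetical, and the subject layout ("SLURM Job_id=... Name=... <status>, <Queued|Run> time hh:mm:ss, ...") is an assumption about how the scheduler formats its notification emails. It also assumes the job name contains no spaces.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SlurmSubjectParser {

    // Reads one token starting at 'start', stopping at a space, a comma,
    // or the end of the string.
    static String tokenAt(String s, int start) {
        int end = start;
        while (end < s.length() && s.charAt(end) != ' ' && s.charAt(end) != ',') {
            end++;
        }
        return s.substring(start, end);
    }

    // Returns the token that directly follows 'key', or null if 'key' is absent.
    static String valueAfter(String subject, String key) {
        int at = subject.indexOf(key);
        return at < 0 ? null : tokenAt(subject, at + key.length());
    }

    // Extracts the common fields (job_id, name, status, time) from a subject
    // line using only indexOf/substring, with no regular expressions.
    static Map<String, String> parse(String subject) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("job_id", valueAfter(subject, "Job_id="));
        fields.put("name", valueAfter(subject, "Name="));

        // Assumption: the status word is the token right after the Name=... token.
        String status = null;
        int nameAt = subject.indexOf("Name=");
        if (nameAt >= 0) {
            int space = subject.indexOf(' ', nameAt);
            if (space >= 0 && space + 1 < subject.length()) {
                status = tokenAt(subject, space + 1);
            }
        }
        fields.put("status", status);

        // Assumption: the duration follows the word "time " (queued or run time).
        fields.put("time", valueAfter(subject, "time "));
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical sample subject, used only for illustration.
        String subject =
            "SLURM Job_id=12345 Name=test_job Ended, Run time 00:01:30, COMPLETED, ExitCode 0";
        System.out.println(parse(subject));
    }
}
```

A map is used here only to keep the sketch short; in the design described above, these four values would instead populate the Thrift object before it is published to the queue.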
