Hi Oscar,

Thanks for your suggestions. Yes, there should be a 1..n relationship from EmailContact to Email (I missed that in my diagram). I agree that the EmailContact-Reputation relationship can be derived from the sender attribute of the emails the contact has sent.
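
To make the derived option concrete, here is a minimal Java sketch. All class, field and method names in it are hypothetical (nothing is taken from the RB code or from the Isis API), and it assumes the per-email score is a number in [0, 1]. The contact's reputation is computed on demand as the mean score of the emails it sent, rather than stored as a separate relationship:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: class, field and method names are illustrative only.
class ReputationSketch {
    public static void main(String[] args) {
        EmailContact alice = new EmailContact("alice@example.org");
        alice.addEmail("m1", 0.2);
        alice.addEmail("m2", 0.8);
        System.out.println(alice.address + " -> " + alice.derivedReputation());
    }
}

class Email {
    final String id;
    final EmailContact sender;
    final double reputationScore; // assumed to lie in [0, 1]

    Email(String id, EmailContact sender, double reputationScore) {
        this.id = id;
        this.sender = sender;
        this.reputationScore = reputationScore;
    }
}

class EmailContact {
    final String address;
    final List<Email> emails = new ArrayList<>(); // 1..n side of EmailContact -> Email

    EmailContact(String address) {
        this.address = address;
    }

    // Registers an email sent by this contact and returns it.
    Email addEmail(String id, double reputationScore) {
        Email email = new Email(id, this, reputationScore);
        emails.add(email);
        return email;
    }

    // Derived value: mean score of the emails sent by this contact, or 0.0 if none.
    double derivedReputation() {
        return emails.stream()
                .mapToDouble(e -> e.reputationScore)
                .average()
                .orElse(0.0);
    }
}
```

With this shape, an explicit Reputation entity on EmailContact would mainly be useful if the normalized score has to be persisted or queried without loading the contact's emails.
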
I was also thinking that EmailContact should have a link to a Reputation entity representing the reputation profile of that contact. This Reputation entity would hold a normalized reputation score computed over all the emails exchanged with the EmailContact. Do you think this relationship should not be explicitly defined?

Here is the edited class diagram incorporating your suggestions:
http://yuml.me/edit/086f250a

More suggestions are welcome.

Thanks,
Dileepa


On Wed, Mar 26, 2014 at 2:43 AM, GESCONSULTOR - Óscar Bou <[email protected]> wrote:

> Mmmm... it seems I did not paste it properly.
>
> Try this one:
>
> http://yuml.me/7801f5db
>
> Or, if the EmailContact relationship with UserInbox is also considered a
> derived one (the same EmailContact could have Emails in different
> UserInboxes):
>
> http://yuml.me/2359d93b
>
> Just to question the model :-)
>
> Regards,
>
> Oscar
>
>
> On 25/03/2014, at 22:00, GESCONSULTOR - Óscar Bou <[email protected]> wrote:
>
> > Hi, Dileepa.
> >
> > Just some questions to help validate the model.
> >
> > Why not a variation like this?
> > http://yuml.me/edit/825d7db5
> >
> > It is still not clear to me why the Reputation entity has a relationship
> > with EmailContact as well, and not only with Email.
> > The EmailContact relationship could always be derived from the Email's
> > sender (an EmailContact) so, unless you're explicitly modeling that
> > derived relationship, it shouldn't appear.
> >
> > HTH,
> >
> > Oscar
> >
> >
> > On 25/03/2014, at 21:20, Dileepa Jayakody <[email protected]> wrote:
> >
> >> Hi Dan and all,
> >>
> >> Here is the basic class diagram for the domain entities in RB:
> >> http://yuml.me/825d7db5
> >>
> >> Please note that I have used the name EmailContact instead of
> >> EmailSenderProfile for clarity. Effectively, this entity represents
> >> the email contacts in the user's inbox.
> >>
> >> Each email and email contact will have a corresponding Reputation entity.
> >> And in the view models, EmailReputationViewModel will display emails
> >> with their reputation data and ContactReputationViewModel will display
> >> email contacts with their reputation data in the RB web application.
> >>
> >> Your ideas and suggestions are most welcome.
> >>
> >> Thanks,
> >> Dileepa
> >>
> >>
> >> On Tue, Mar 25, 2014 at 3:42 PM, Dileepa Jayakody <[email protected]> wrote:
> >>
> >>> Hi Dan,
> >>>
> >>> Thanks a lot for your insight. Please see my comments inline below.
> >>>
> >>>
> >>> On Tue, Mar 25, 2014 at 1:21 PM, Dan Haywood <[email protected]> wrote:
> >>>
> >>>> Hi Dileepa,
> >>>>
> >>>> I've just posted the comments below on your GSOC proposal. I know that
> >>>> you can't make further changes to the proposal, so I'm posting them
> >>>> here on the dev list, so we can keep the conversation going.
> >>>>
> >>>> So..
> >>>>
> >>>> * good to see you intend to set up a project on github for this; please
> >>>> do this asap. That way you can start to capture docs/working notes. I
> >>>> also suggest that you set up github pages for your site [1].
> >>>>
> >>>> * What I'd like to see right now is some sort of UML diagram; you could
> >>>> sketch one using yuml.me [2] and add it to your github site. I can't
> >>>> quite work out how the persistent domain entities relate to each other.
> >>>> In particular, are EmailSenderProfile and Reputation in 1-1
> >>>> correspondence?
> >>>>
> >>>
> >>> I will draw an ER diagram for the domain entities and we can enhance it
> >>> over discussions.
> >>> Yes, I pictured EmailSenderProfile as the representation of an email
> >>> sender (a contact), and each email sender will have a corresponding
> >>> reputation score (accumulated and normalized over the emails sent by
> >>> that sender), represented by the Reputation domain entity.
> >>>
> >>>>
> >>>> * In your timeline I noticed you said "Commit all code to github", only
> >>>> on Aug 11.
> >>>> It's much better practice (and will help mentors guide you) if
> >>>> you commit changes as you go. That way it's also safely backed up, and
> >>>> you can go back in time if you mess up.
> >>>>
> >>>
> >>> Yes, I agree. In fact, I didn't mean that I would commit all the code at
> >>> once on Aug 11. I meant that I'm planning to finish development and
> >>> have everything committed by Aug 11.
> >>> I strongly agree on getting feedback throughout development; after
> >>> all, I'm planning to use agile development for my project :). Sorry for
> >>> expressing this in a misleading way in the proposal.
> >>>
> >>>>
> >>>> * You might also want to version control the academic paper, too, if
> >>>> your university lets you.
> >>>>
> >>>>
> >>>> Some further points relating to the design:
> >>>>
> >>>> * You have Email as a persistent entity. I'm a bit worried what that
> >>>> might mean about storage and also synchronization. Is it necessary to
> >>>> have the Email persisted in Isis? If not persisted, then should the
> >>>> Email entity be a view model, or a fake persistent entity utilizing a
> >>>> new StoreManager impl in JDO? See the recent thread [3] on this topic.
> >>>>
> >>>
> >>> The Email entity will have several attributes, such as id, sender-id and
> >>> reputation-score. sender-id will be mapped to the EmailSenderProfile, and
> >>> reputation-score will be a score given by the ML process evaluating the
> >>> reputation of the email. Could the Email entity be a view model in this
> >>> scenario? If so, what is the advantage of defining it as a view model?
> >>>
> >>> I think we can discuss this further with an ER diagram for the
> >>> application. I will come up with an ER diagram asap.
> >>>
> >>>
> >>>> * Conversely, does Mahout require some sort of persistent dataset of
> >>>> emails in order to do the reputation scoring? Or does it just hold
> >>>> aggregated information?
> >>>> If the former, I worry that we now have each email
> >>>> stored in potentially 3 places: gmail, Isis and Mahout. Keeping these
> >>>> in sync would be a nightmare.
> >>>>
> >>>
> >>> AFAIK the Mahout process requires a persistent dataset (file based or
> >>> database based) to train the classifier, and it will build a classifier
> >>> model (an aggregated information structure describing how to classify
> >>> new data). Mahout will not persist the email data again.
> >>> Therefore I feel Mahout will need access to the email dataset either
> >>> straight from gmail as the datasource or from an Isis datasource (after
> >>> retrieving all Emails to Isis).
> >>> If you think retrieving and storing all emails in Isis is not a good
> >>> idea, maybe the EmailService can be implemented only as a connector
> >>> from gmail to Mahout.
> >>>
> >>>>
> >>>> * It occurs to me that you're going to need some entities to keep track
> >>>> of the high water mark of the most recently analyzed email, so that
> >>>> when you poll for new emails you know which to ask for. This high water
> >>>> mark is per user of RB. So I think you'll either need an entity to
> >>>> represent your RB User, or you could use the UserSettings service [4][5]
> >>>>
> >>>
> >>> Yes, I will definitely need an entity to represent the RB User. In
> >>> fact, user management will also be key in the application, since one
> >>> user should not be able to access another user's email or reputation
> >>> data.
> >>> Thanks for the suggestions. Would it be a good idea to extend the
> >>> UserSettings entity to represent RB-specific user data, or to have a
> >>> separate entity for RB_User?
> >>>
> >>>>
> >>>> * In the proposal the term "reputation index" is associated with
> >>>> the email sender. Is that the same as "Reputation"?
> >>>>
> >>>
> >>> Yes.
> >>> By "building the reputation index" I meant that the initial reputation
> >>> analysis process will generate the initial reputation scores for all
> >>> past emails and create Reputation profiles for each EmailSender.
> >>>
> >>>>
> >>>> * The initial download of emails for analysis probably needs to be done
> >>>> using multiple batches (of say 100 at a time), in case there's a
> >>>> glitch/network issue.
> >>>>
> >>>
> >>> Agreed. I think the Isis BackgroundService can be used for this?
> >>>
> >>>>
> >>>> * I was interested to note that you see the Isis webapp as being an
> >>>> email client itself. I suggest you keep it as read-only, though...
> >>>> otherwise you'll end up reinventing all of gmail (not advisable, I
> >>>> think).
> >>>>
> >>>
> >>> Yes, I would keep the webapp as a read-only, demo-purpose application,
> >>> basically a presentation layer for the view model
> >>> EmailReputationViewModel, displaying the recent emails and their
> >>> reputation information as well as the reputation profiles of the email
> >>> senders.
> >>>
> >>>>
> >>>> * One of the first tasks you've set yourself (til 21 Apr) is to "try out
> >>>> Apache wicket samples [10] to learn how to develop the presentation
> >>>> layer of the application". In fact, with Isis you don't need to do any
> >>>> presentation layer coding; start building out your prototype and you'll
> >>>> see what I mean.
> >>>>
> >>>
> >>> I wanted to try out Apache Wicket to get an understanding of the Wicket
> >>> configuration and programming model for developing view models. :)
> >>>
> >>>>
> >>>> * I'm still unsure about oAuth integration. The EmailService is going
> >>>> to require credentials to access gmail, and that's "within" the Isis
> >>>> domain model. But Shiro/buji-pac4j sits in front of Isis.
> >>>> If Shiro has done the
> >>>> oAuth sign-in, then I guess it'll be necessary to surface those
> >>>> credentials somehow to the EmailService (perhaps using Shiro's
> >>>> org.apache.shiro.SecurityUtils#getSubject() method). Perhaps the best
> >>>> thing is to get buji-pac4j done, then see what information is surfaced
> >>>> that way.
> >>>>
> >>>
> >>> Yes, this requires a bit of research. I wanted to implement RB as a
> >>> web application which doesn't ask for the user's email credentials to
> >>> perform the reputation analysis process. In the worst case it will
> >>> require the user's email credentials to perform the EmailService's
> >>> email retrieval process.
> >>>
> >>> In summary, thanks a lot for your insight into the project. I will set
> >>> up a github project and come up with an ER diagram asap.
> >>>
> >>> Thanks,
> >>> Dileepa
> >>>
> >>>>
> >>>> HTH
> >>>> Dan
> >>>>
> >>>> [1] http://pages.github.com/
> >>>> [2] http://yuml.me/
> >>>> [3] http://isis.markmail.org/thread/lsg3uywlfjviztzi
> >>>> [4] http://isis.apache.org/reference/services/settings-services.html
> >>>> [5] http://isis.apache.org/components/objectstores/jdo/services/settings-services-jdo.html
