On Mon 26 Mar, 2018, 09:57 Saahil Sirowa, <[email protected]> wrote:
> Hi Kevin and SpamAssassin Dev Community,
> Which one would be better for the testing mechanism: Travis CI or CMake?
>
> Thanks...
> Saahil Sirowa
> Indian Institute of Technology Hyderabad
> B. Tech Computer Science and Engineering
>
> On Mon 26 Mar, 2018, 07:29 Saahil Sirowa, <[email protected]> wrote:
>
>> Hi Kevin,
>> I know you have already gone through the proposal once. But, I still
>> request you to go through it. Your suggestions in this final phase will
>> prove valuable.
>>
>> Awaiting a favorable response.
>>
>> I intentionally didn't send this mail to the dev mailing list.
>>
>> Thanks...
>> Saahil Sirowa
>> B. Tech Computer Science and Engineering
>> Indian Institute of Technology, Hyderabad
>>
>> On Mon, Mar 26, 2018 at 7:24 AM, Saahil Sirowa <[email protected]> wrote:
>>
>>> Hi Kevin and SpamAssassin Dev Community,
>>> I have made some changes in the draft.
>>> GSoC 2018 Proposal
>>> <https://docs.google.com/document/d/1-OCNv79sHvVViKwnrRYtlMiKWLCzz4xUW4tNOlmaTmw/edit?usp=sharing>
>>>
>>> I request you all to rigorously review it and suggest appropriate edits.
>>> As this is the final phase of the application period (deadline 27th March
>>> 16:00 UTC), I would really appreciate it if you respond before this. This
>>> will help me in incorporating the suggested changes in time.
>>>
>>> Thanks...
>>> Saahil Sirowa
>>> B. Tech Computer Science and Engineering
>>> Indian Institute of Technology, Hyderabad
>>>
>>> On Fri, Mar 23, 2018 at 7:55 PM, Saahil Sirowa <[email protected]> wrote:
>>>
>>>> I had some in last 2-3 days. I will update the proposal draft with
>>>> required changes by tomorrow night (Sat night).
>>>>
>>>> Thanks...
>>>> Saahil Sirowa
>>>> B. Tech Computer Science and Engineering
>>>> Indian Institute of Technology, Hyderabad
>>>>
>>>> On Fri 23 Mar, 2018, 18:01 Kevin A. McGrail, <[email protected]> wrote:
>>>>
>>>>> Wanted to check in and see how you are doing.
>>>>> This blog post has gotten some praise:
>>>>>
>>>>> https://medium.com/@owtf/google-summer-of-code-writing-a-good-proposal-141b1376f076
>>>>>
>>>>> --
>>>>> Kevin A. McGrail
>>>>> Asst. Treasurer & VP Fundraising, Apache Software Foundation
>>>>> Chair Emeritus Apache SpamAssassin Project
>>>>> https://www.linkedin.com/in/kmcgrail - 703.798.0171
>>>>>
>>>>> On Wed, Mar 21, 2018 at 7:52 AM, Kevin A. McGrail <[email protected]> wrote:
>>>>>
>>>>>> Comments allowed might be helpful though :-)
>>>>>>
>>>>>> --
>>>>>> Kevin A. McGrail
>>>>>> Asst. Treasurer & VP Fundraising, Apache Software Foundation
>>>>>> Chair Emeritus Apache SpamAssassin Project
>>>>>> https://www.linkedin.com/in/kmcgrail - 703.798.0171
>>>>>>
>>>>>> On Wed, Mar 21, 2018 at 12:36 AM, Rajkiran Rajkumar <[email protected]> wrote:
>>>>>>
>>>>>>> @Saahil, kindly make your doc view-only for people with a link to
>>>>>>> it. Giving edit permissions to the world is a bad idea.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Rajkiran
>>>>>>>
>>>>>>> On Tue, Mar 20, 2018 at 5:17 PM, Kevin A. McGrail <[email protected]> wrote:
>>>>>>>
>>>>>>>> +users
>>>>>>>>
>>>>>>>> All we give is feedback. The submission to GSoC is what matters.
>>>>>>>> So if you mentioned Perl here, that's not going to carry over to
>>>>>>>> the reviewers.
>>>>>>>>
>>>>>>>> Can someone with fresh eyes take a look at this? I read it too
>>>>>>>> recently so I will gloss over it too much.
>>>>>>>>
>>>>>>>> Here are some posts the mentors list thought might be helpful. The
>>>>>>>> first I believe covers someone's pov who did not get selected.
>>>>>>>>
>>>>>>>> https://medium.freecodecamp.org/hacking-gsoc-how-to-gain-real-life-experience-and-support-open-source-b1e6a664f6e4?source=linkShare-53ba2bb84284-1521381334
>>>>>>>>
>>>>>>>> https://sanatt.me/2017/12/30/cracking-google-summer-code-2018/
>>>>>>>>
>>>>>>>> Regards, KAM
>>>>>>>>
>>>>>>>> On Tue, Mar 20, 2018, 03:57 Saahil Sirowa <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Kevin and Apache SpamAssassin Dev Community,
>>>>>>>>>
>>>>>>>>> I have resolved all the changes you suggested in the previous
>>>>>>>>> draft.
>>>>>>>>> 1) I mentioned learning Perl a week before the community bonding
>>>>>>>>> period. It will not take much time. I can assure you that
>>>>>>>>> language is not going to be an issue.
>>>>>>>>> 2) I updated the biography part a bit.
>>>>>>>>> 3) Significant changes have been made in the Timeline.
>>>>>>>>> 4) I'm planning to use CMake/Travis CI for automated testing. If
>>>>>>>>> there is a better alternative please do suggest.
>>>>>>>>> 5) I gave links to research papers that I will be reading in the
>>>>>>>>> timeline.
>>>>>>>>> 6) I updated the timeline to include gaining advanced knowledge
>>>>>>>>> of email traffic and spam. I listed some links for the purpose.
>>>>>>>>> 7) I updated the credits.
>>>>>>>>> 8) There are other changes made in various parts of the proposal.
>>>>>>>>>
>>>>>>>>> Thanks for your previous detailed feedback.
>>>>>>>>>
>>>>>>>>> Here is the link to the updated proposal:
>>>>>>>>> GSoC 2018 proposal
>>>>>>>>> <https://docs.google.com/document/d/1-OCNv79sHvVViKwnrRYtlMiKWLCzz4xUW4tNOlmaTmw/edit#heading=h.q7h3lddabdvh>
>>>>>>>>> Please rigorously review it and suggest any changes that I should
>>>>>>>>> make.
>>>>>>>>>
>>>>>>>>> Awaiting a favorable response.
>>>>>>>>>
>>>>>>>>> Thanks...
>>>>>>>>> Saahil Sirowa
>>>>>>>>> B. Tech Computer Science and Engineering
>>>>>>>>> Indian Institute of Technology, Hyderabad
>>>>>>>>>
>>>>>>>>> On Mon, Mar 19, 2018 at 3:27 AM, Kevin A. McGrail <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Saahil
>>>>>>>>>>
>>>>>>>>>> re: Perl. As the project is primarily in Perl and you do not list
>>>>>>>>>> that in your Proficiencies or any similar languages like PHP, I
>>>>>>>>>> would address that. The word Perl does not appear a single time.
>>>>>>>>>>
>>>>>>>>>> Your Biography is a little light on why this is something you
>>>>>>>>>> feel you can implement. The mentors will likely NOT be able to
>>>>>>>>>> help you with the science, rather focusing on the community,
>>>>>>>>>> processes, and open source in general.
>>>>>>>>>>
>>>>>>>>>> re: Email and Spam, do you have any experience with email traffic
>>>>>>>>>> or spam? If so, add it. If not, explain what you plan to do to
>>>>>>>>>> address that.
>>>>>>>>>>
>>>>>>>>>> Re: Deliverables, I think you'll need to propose the first draft
>>>>>>>>>> of that. But your goal will likely be a plugin for Apache
>>>>>>>>>> SpamAssassin that can be installed and configured to provide
>>>>>>>>>> multiple configurable statistical analysis algorithms to better
>>>>>>>>>> identify ham (good email) and/or spam (bad email).
>>>>>>>>>>
>>>>>>>>>> Please use Apache SpamAssassin to properly brand the title.
>>>>>>>>>>
>>>>>>>>>> Re: I have no input on the scheduling/timelines except that past
>>>>>>>>>> proposals I have read have included more phases and do not add
>>>>>>>>>> "optional" items. I'd prefer to see small increments to make sure
>>>>>>>>>> you stay on schedule and don't get overwhelmed and find yourself
>>>>>>>>>> way behind as the time progresses.
>>>>>>>>>>
>>>>>>>>>> Re: Testing Methodology, this is likely the most critical missing
>>>>>>>>>> part.
>>>>>>>>>> I am a fan of test-driven development, where you set up tests
>>>>>>>>>> that should pass and fail and use continuous testing as you add
>>>>>>>>>> code to confirm your development is progressing well.
>>>>>>>>>>
>>>>>>>>>> This is especially important because spam analysis often doesn't
>>>>>>>>>> work the way people expect and tests w/statistics can help
>>>>>>>>>> identify issues.
>>>>>>>>>>
>>>>>>>>>> For example, it is a hypothesis that these statistical
>>>>>>>>>> algorithms will be better than Bayes. So you'll need a baseline
>>>>>>>>>> for comparison.
>>>>>>>>>>
>>>>>>>>>> Additionally, even experts in the field are surprised when they
>>>>>>>>>> think something will prove the hamminess of an email but in fact
>>>>>>>>>> it shows the opposite. Real-world example: SPF is a policy that,
>>>>>>>>>> when introduced, was supposed to allow an automated mechanism
>>>>>>>>>> that says "this is an email from a legitimate mail server for my
>>>>>>>>>> domain".
>>>>>>>>>>
>>>>>>>>>> However, the FIRST wave of people to adopt it were all spammers.
>>>>>>>>>> So it became a spam indicator more than a ham indicator. It was a
>>>>>>>>>> very interesting outcome.
>>>>>>>>>>
>>>>>>>>>> Re: Corpora, you'll want a corpus of carefully hand-sorted ham
>>>>>>>>>> and spam. Have you thought about how you'll get that? I *might*
>>>>>>>>>> be able to help but it's 50/50.
>>>>>>>>>>
>>>>>>>>>> Re: You mention reading research papers on statistical algorithms
>>>>>>>>>> from a previous proposal. You'll want to list them to show which
>>>>>>>>>> ones you plan to study.
>>>>>>>>>>
>>>>>>>>>> re: "Discussions with the SA community regarding the various
>>>>>>>>>> types of spams that the present SA can handle." is unclear. What
>>>>>>>>>> is a "type of spam" to you? Do you have a list of types of spam?
>>>>>>>>>>
>>>>>>>>>> re: "Brainstorming with the mentors and SA community about the
>>>>>>>>>> various input features and parameters that can have a huge impact
>>>>>>>>>> on the overall performance of the listed neural nets models." I
>>>>>>>>>> think this is flawed. There won't be a ton of people who can
>>>>>>>>>> discuss this with you. You'll likely need to use the scientific
>>>>>>>>>> process to show what has a performance impact. This is not busy
>>>>>>>>>> work or school work. This is an experiment that has not been
>>>>>>>>>> tried at the SA project.
>>>>>>>>>>
>>>>>>>>>> re: "actively involved with the community." is a stretch. A few
>>>>>>>>>> emails do not active involvement make.
>>>>>>>>>>
>>>>>>>>>> re: Bonding, you might consider raising that to 1-2 major bugs
>>>>>>>>>> and 10-20 minor bugs.
>>>>>>>>>>
>>>>>>>>>> Re: Credits/references, I would add more clarity about where each
>>>>>>>>>> of those references is used.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> KAM
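
The testing advice in Kevin's review (tests that should pass and fail, plus a baseline for comparison on a hand-sorted corpus) could look roughly like the sketch below. It is only an illustration in Python, not SpamAssassin code: the classifier signature, the `Rates` type, the toy corpus, and both stand-in classifiers are assumptions made up for the example. A real run would compare the proposed algorithms against the existing Bayes results on a much larger hand-sorted corpus.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple

# Assumption for this sketch: a classifier is any function that maps raw
# message text to True (spam) or False (ham). This is not a SpamAssassin API.
Classifier = Callable[[str], bool]


@dataclass
class Rates:
    false_positive: float  # fraction of ham wrongly flagged as spam
    false_negative: float  # fraction of spam wrongly passed as ham


def evaluate(classify: Classifier, corpus: Iterable[Tuple[str, bool]]) -> Rates:
    """Score a classifier against hand-sorted (text, is_spam) pairs."""
    ham = spam = fp = fn = 0
    for text, is_spam in corpus:
        flagged = classify(text)
        if is_spam:
            spam += 1
            fn += not flagged  # spam that slipped through
        else:
            ham += 1
            fp += flagged      # ham that was wrongly flagged
    return Rates(fp / ham if ham else 0.0, fn / spam if spam else 0.0)


def no_worse_than_baseline(candidate: Rates, baseline: Rates) -> bool:
    """Regression check: the candidate must not lose on either error rate."""
    return (candidate.false_positive <= baseline.false_positive
            and candidate.false_negative <= baseline.false_negative)


# Hypothetical toy corpus; a real one would be large and hand-sorted.
CORPUS = [
    ("Meeting moved to 3pm, see agenda attached", False),
    ("Your invoice for March is ready", False),
    ("WIN a FREE prize, click now", True),
    ("Cheap pills, click now", True),
]


def baseline(text: str) -> bool:
    # Stand-in for the existing classifier (hypothetical keyword rule).
    return "free" in text.lower()


def candidate(text: str) -> bool:
    # Stand-in for a proposed new algorithm (hypothetical keyword rule).
    lowered = text.lower()
    return "free" in lowered or "click now" in lowered


if __name__ == "__main__":
    base, cand = evaluate(baseline, CORPUS), evaluate(candidate, CORPUS)
    print("baseline:", base)
    print("candidate:", cand)
    print("candidate no worse:", no_worse_than_baseline(cand, base))
```

Running a check like this on every commit, whichever CI service is chosen, is what would catch a change that quietly makes ham detection worse, as in the SPF example above.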
