Here is a link: http://learn.github.com/p/intro.html
On Wed, Jun 13, 2012 at 1:56 AM, Elham Hormozi <[email protected]> wrote:

> Hello
> Excuse me, I don't understand github exactly!
> Can you explain about that?
>
> Regards
>
> On Sun, Jun 10, 2012 at 9:48 AM, Ted Dunning <[email protected]> wrote:
>
> > On Sat, Jun 9, 2012 at 9:03 PM, Elham Hormozi <[email protected]> wrote:
> >
> > > *- Can you say more about this?*
> > > AIRS is a classification algorithm based on artificial immune system.
> > > I will use it for credit card fraud detection in which it is proved
> > > to have a good performance. This is an academic project.
> >
> > Great.
> >
> > That probably means that you should build your project as an
> > independent project using some repository like github. Mahout is
> > available as a Maven dependency so that is probably a good way to go
> > for building your project.
> >
> > > *- Perhaps define what the airs algorithm does? (it looks like a
> > > variant on k-nn algorithm)*
> > > AIRS generates a set of memory cells in training phase. In
> > > classification phase it uses the generated cells in knn algorithm as
> > > neighbors. Generating memory cells is based on clonal selection that
> > > is introduced in AIS. (I can explain more about the details if needed)
> >
> > No need. This disambiguates what you were talking about.
> >
> > > *- How do you plan to add it?*
> > > The code is ready. AIRS has been implemented in java and been tested
> > > before. We have used MapReduce in training phase and ran it on
> > > virtual nodes with no problem. I've already downloaded Mahout source
> > > and ran KMeans and Bayes via source with no problem. Now I want to
> > > know if it is possible to add any new algorithm to mahout, if yes
> > > how? Is there any defined structure in which I should implement
> > > code, or special functions that must be inserted?
> >
> > The basic requirements at this point are:
> >
> > - you use the correct style (basically Lucene style)
> >
> > - you have comprehensive test cases
> >
> > - you provide good documentation
> >
> > - the system is based on Mahout libraries and doesn't bring in
> >   redundant dependencies.
> >
> > In addition, we have recently started to increase the required level
> > of support and adoption that a package needs to have before being
> > added to Mahout. An academic project typically doesn't meet either of
> > these requirements.
> >
> > I suggest that you host your project on Github, but discuss it here on
> > the Mahout mailing lists. If, over time, you get some adoption and we
> > see ongoing maintenance happening then that might be an appropriate time
> >
> > > *- Will you be maintaining it? Are there users who will be using
> > > this algorithm in production?*
> > > As mentioned before this is an academic project. Our purpose is
> > > mostly about implementing AIRS using MapReduce, as Mahout is a
> > > powerful library we'd like to add our code to this library.
> >
> > That sounds like a no.
> >
> > If you aren't willing to support this code, then why do you think that
> > others will be willing to?
> >
> > The key here is that Mahout is going through the process of deleting a
> > bunch of code that people don't seem interested in adopting or
> > maintaining. So why should we add more of this kind of code?
> >
> > > *- Why is it needed? (based on *BENCHMARKING THE AIRS ARTIFICIAL
> > > IMMUNE SYSTEM FOR CLASSIFICATION* by van der Putten and Ling, it
> > > doesn't appear to offer anything extraordinary)*
> > > The goal of project is to measure the performance of AIRS using
> > > MapReduce for credit card fraud detection. It has been shown in some
> > > papers that AIRS can perform better than some other common
> > > algorithms for fraud detection.
> > > Body Immune System is similar to fraud detection system that's why
> > > we have chosen this algorithm.
> >
> > That's great. Do the comparison. You don't need to add it to Mahout
> > for this.
> >
> > > *- Does it scale? (in particular, will it perform any better than a
> > > simple k-nn based on better known algorithms)*
> > > Considering fraud detection particularly, yes it does perform better.
> >
> > I thought that you haven't done the comparison yet. How do you know
> > that it works better?
>
> --
> Best Regard
> Elham Hormozi
