Re: [ANN] Introducing Apache Agora - reloaded!
Replying to the community list as requested...

Neat app! It is not immediately intuitive how to interpret it, but with a little experimentation I could see patterns. For example, it was interesting to notice how my email moved from the outskirts of the circle with data from early months to the center of the circle in later months (for the projects I'm involved with).

I'm still unclear on what to look for in terms of community health. What are some of the general macro patterns you've seen with this tool? What insight does this provide into the community? The docs provide a good micro-level description of how the app models the relationships between individuals, but don't discuss the macro patterns that emerge. It'd be interesting to hear some of your thoughts.

Best,
WILL

----- Original Message -----
From: Stefano Mazzocchi [EMAIL PROTECTED]
To: Apache Committers [EMAIL PROTECTED]
Sent: Thursday, July 14, 2005 12:14 PM
Subject: [ANN] Introducing Apache Agora - reloaded!

NOTE: please excuse the noise if you are not interested, but there is no easier way to reach all of you and I thought many of you might be interested in this.

<hat type=director mode=off>

A few years ago, around the time the incubator started to appear as the escape valve for the growth problems that some projects were exhibiting, I started to wonder if there could be a way to make the job easier for those mentoring and providing oversight for particular projects, especially if they were not participating in the day-to-day work of the various communities they were helping grow strong and self-sufficient.

The task is very difficult, not only because of the nature of the problem (and the unstructured nature of the data), but also because you don't want to create more problems than you are solving: for example, you don't want people to feel spied on or abused by numerical ratings and rankings.
The result of that thinking was Apache Agora, a system that I designed and implemented 3 years ago and that has been running (quite silently) on Nagoya since then. Since Nagoya is going away, I moved Agora over to minotaur and aligned it with the existing mail archive (the same one that we use to power our official mod_mbox-based archives).

Find it at

+------------------------------------------+
|                                          |
| http://people.apache.org/~stefano/agora/ |
|                                          |
+------------------------------------------+

what is this?
-------------

Agora is a community visualizer. If you wonder who is the core of a particular community (for example, to know who to ask for something) or how big/active/diverse/balanced a community is, Agora is for you.

how does it work?
-----------------

Agora is composed of two pieces:

 1) a Python script that reads mbox files and generates 'precooked' data
 2) a Java applet that reads the precooked data and visualizes it

The script runs every week (on Sundays) on minotaur and is fully incremental, meaning that it knows where it left off the week before.

how about the network?
----------------------

The network is created by harvesting the email addresses and linking them whenever one address replied to a message sent by another address. I say address because an address is not a person: several addresses may belong to the same person (and no, the system doesn't (yet) allow different addresses that belong to the same person to be smooshed together).

In order to reduce noise, the network is then pruned. All addresses that only received or only sent email are removed from the graph. So the resulting graph is a smaller version containing only those nodes that exhibit minimal connectivity characteristics (this helps to remove, for example, agents like bugzilla or SVN or spam, that never reply, or lurkers that don't participate in discussions).

how do I start using it?
------------------------

The tree on the left lists all the 'precooked data' that Agora is able to understand.
This is a mirror of the list of folders in /home/apmail/public-arch on minotaur.apache.org and will be automatically updated when new mailing lists are added (so neither infra@ nor I have to do anything! you can always count on my lazy ass ;-)

In order to see anything, you have to click on one of the files in the tree, wait for a few seconds (until the file icon turns reddish) and then click on the load button. This will load the data, create the network, perform the pruning and show it in the graph pane.

cool, I have a graph, now what?
-------------------------------

Click the start button and the graph will cluster. If you merged data from different mailing lists, you will see them forming different groups. If you click on a node, it will show the address related to that node. If you right-click anywhere, a fisheye zoom
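
The network construction and pruning described in the announcement above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Agora's actual script: the function names are hypothetical, replies are detected through the standard Message-ID and In-Reply-To headers (the announcement does not say how Agora matches replies), and pruning is read as "keep only addresses that have both sent a reply and received one".

```python
import mailbox
from collections import defaultdict
from email.utils import parseaddr


def records_from_mbox(path):
    """Yield (message_id, sender, in_reply_to) for each message in an mbox file.

    Hypothetical helper: reply detection via In-Reply-To is an assumption.
    """
    for msg in mailbox.mbox(path):
        sender = parseaddr(str(msg.get("From", "")))[1].lower()
        message_id = str(msg.get("Message-ID") or "").strip()
        in_reply_to = str(msg.get("In-Reply-To") or "").strip()
        yield message_id, sender, in_reply_to


def build_reply_graph(records):
    """Return edges[replier][author] = number of replies from replier to author."""
    records = list(records)
    # First pass: remember which address sent each message.
    author_of = {mid: sender for mid, sender, _ in records if mid and sender}

    # Second pass: every reply adds a directed link replier -> original author.
    edges = defaultdict(lambda: defaultdict(int))
    for _, sender, parent_id in records:
        author = author_of.get(parent_id)
        if sender and author and sender != author:
            edges[sender][author] += 1
    return edges


def prune(edges):
    """Drop addresses that only sent replies or only received them.

    Assumption: a single pruning pass; a kept node may end up with no edges.
    """
    repliers = set(edges)
    replied_to = {dst for targets in edges.values() for dst in targets}
    keep = repliers & replied_to
    return {
        src: {dst: n for dst, n in targets.items() if dst in keep}
        for src, targets in edges.items()
        if src in keep
    }
```

With a real archive, something like `prune(build_reply_graph(records_from_mbox(path)))` would produce the reduced graph, where `path` points at one mbox file. Addresses are compared as lowercase strings, so two addresses belonging to the same person stay separate nodes, as the announcement says.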
Re: [ANN] Introducing Apache Agora - reloaded!
How would you compare it against Microsoft's Netscan (http://netscan.research.microsoft.com/Static/Default.asp), which also tries to find the main contributors in different communities?

Is 'agora' public knowledge?

What does the 'decay' area do?

How does one differentiate between a useful communication and a flame war? I remember seeing Mark Smith (the Netscan developer) talk about how he could identify the different types via the length of the conversation.

Overall a big '+1'
Re: [ANN] Introducing Apache Agora - reloaded!
Will Glass-Husain wrote:
> Replying to the community list as requested...

Thank you.

> Neat app! Not immediately intuitive as to how to interpret it, but with a little experimentation I could see patterns. For example, it was interesting to notice how my email moved from the outskirts of the circle with data from early months to the center of the circle in later months (for the projects I'm involved with). I'm still unclear on what to look for in terms of community health.

eheh, I'm not sure either :-)

> What are some of the general macro patterns you've seen with this tool?

First of all, the 'size' of the pruned graph is generally a good sign: a bigger graph means there is less chance of a few key players moving out of the project and leaving the social network disconnected.

Another interesting thing is that the people at the center are actually the people I expect to be there. In projects that I follow, I was hardly ever surprised: the distance of their node from the 'center of social gravity' of the community was always (and I mean *always*) reasonable. I don't know about the projects that I don't follow, but I've never heard anybody complain.

I also found it to be very effective for understanding how much traction/influence a person might have in a community, by dragging their node. Sometimes, if several people are involved in a discussion, I pull their nodes apart and see where the center of gravity shifts. Normally the result of the discussion tends to settle toward the person whose node moved the graph the most.

This is amazing, because Agora does *NOT* even try to understand what the messages say, only that the message did happen.

I suspect there is a deep reason for the apparently incredible signal: in well-behaved communities, people do not reply if they don't have anything to say. I suspect Agora would fail miserably to be as effective in dysfunctional communities where people keep emailing each other with flamewars. Luckily, this is rarely the case in the foundation.
> What insight does this provide into the community? The docs provide a good micro level description of how the app models the relationships between individuals, but don't discuss the macro patterns that emerge. It'd be interesting to hear some of your thoughts.

I wrote this years ago, as an experiment. Then I started to use it more and more as a 'telescope' to look at communities that I didn't know, to understand who the key players in those communities were or, if I heard something worrisome about somebody, whether or not to worry that it could have a big impact on a particular community.

Unfortunately, this came before the incubator was set up, so the mail archive on nagoya, which was based on eyebrowse, was kinda left alone and a lot of the mailing lists were not there. Some people from the incubator wanted to evaluate the growth of the project with Agora, but they couldn't.

There seems to be a lot of information in there. I have my own way of using it but I don't know if it's a general rule and I don't want people to think that their project is better than another just because their graph is bigger or more densely connected.

But it is fascinating to compare different mailing lists, especially over time. For example, whether 'dev' is more or less densely connected than 'users'. And it's also very useful for understanding the 'bridges', the people that write email in more than one mailing list. Those are very important people for the ASF, as they bring cross-pollination and allow information to flow through the various islands (and improve our ability to evolutionarily adapt to change in the technical and social ecosystem).

It's a social telescope. And normally it's a lot of fun to use telescopes, even if you don't understand everything about why the stars and galaxies are the way they are. I feel the same way about Agora: you don't have to have a model of what is happening in absolute terms, as long as you can spot differences between various projects.
But I don't know the metric for community health and I don't think such a thing even exists, so if that's what you are looking for, you are not going to get it from Agora (nor anything I do).

--
Stefano.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
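
The 'bridges' mentioned in the reply above (addresses that write to more than one mailing list) are easy to pick out once the participants of each list are known. The following is a minimal sketch, not part of Agora: the function name `find_bridges` and the input shape (a dict mapping a list name to the set of addresses seen on it) are assumptions made for illustration.

```python
from collections import defaultdict


def find_bridges(participants_by_list):
    """Return {address: sorted list names} for addresses seen on 2+ lists.

    `participants_by_list` maps a mailing-list name to an iterable of
    addresses (hypothetical input; Agora's own data format is not shown
    in this thread).
    """
    lists_of = defaultdict(set)
    for list_name, addresses in participants_by_list.items():
        for address in addresses:
            # Compare addresses case-insensitively.
            lists_of[address.lower()].add(list_name)
    return {
        address: sorted(names)
        for address, names in lists_of.items()
        if len(names) > 1
    }
```

As with the graph itself, this works on addresses rather than people, so someone who posts to two lists from two different addresses would not be detected as a bridge.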
Re: [ANN] Introducing Apache Agora - reloaded!
Ian Holsman wrote:
> How would you compare it against Microsoft's Netscan (http://netscan.research.microsoft.com/Static/Default.asp) ? which also tries to find the main contributors in different communities.

I think 'main' implies metrics and I really didn't want to go there. I think contribution is inversely proportional to the distance from the center of gravity of the group, but I wanted to keep it subjective to avoid building altars that people want to fight to step on.
Re: [ANN] Introducing Apache Agora - reloaded!
Stefano Mazzocchi wrote:
> Ian Holsman wrote:
>> How would you compare it against Microsoft's Netscan (http://netscan.research.microsoft.com/Static/Default.asp) ? which also tries to find the main contributors in different communities.
>
> I think main implies metrics and I really didn't want to go there. I think contribution is inversily proportional to the distance from the center of gravity of the group, but I wanted to keep it subjective to avoid building altars than that people want to fight to step on.

sorry, hit send too soon.

>> Is 'agora' public knowledge?

No 'private' mailing list is being analyzed, so yes, it's public knowledge. It has not been widely publicized (yet) but I wouldn't be against putting it in a more visible position on the apache.org web site.

>> what does the 'decay' area do?

If you reply once to a message of mine, Agora creates a link between you and me of strength 1.0; if you reply again, another 1.0 gets added. Note that links are directional: you might reply a lot to me while I never reply to you, and this is still taken into account by the graph drawing algorithm.

Decay means that you get 1.0 if you reply now and an exponentially lower value if your reply was earlier in time. I introduced this because I was curious about how much the past of a project (especially if you load a lot of months of a project in memory) was influencing its present.

Rather surprisingly, decay does *NOT* introduce a substantial difference in the way the graph is shaped or in the position of people in the graph, which is a very very interesting property and I have no idea why that is the case.

>> How does one differentiate between a useful communication and a flame war?

There is no attempt to do so, ATM.

>> I remember seeing Mark Smith (the netscan developer) talk about how he could identify the different types via the length of the conversation.
As I mentioned earlier, we don't tend to host a lot of inflammatory people in Apache (I don't really know why; I suspect it is a historical thing, or a habit of not reacting aggressively to aggression, which makes flame-lovers go somewhere else, but I don't know how to test this hypothesis), and this keeps the signal/noise ratio high.

Identifying a conversation means that at least *you* can pretend to understand the difference between inflammatory and not. I suspect this difference is also very cultural: a conversation that has a 'normal' tone in one community might be considered very 'strident' in another. I'm sure I'm not the only one who has experienced this.

At the end of the day, I'm a big fan of the love/hate hypothesis: replying to somebody indicates a sort of preferential attachment, no matter what you are saying. Ignoring them is the only signal that the communication is not useful.

NOTE: I do *not* think that the size of the social cluster is an indication of health; there is something else that influences it... but I don't know what it is (yet).

>> Overall a big '+1'

Thanks.

--
Stefano.
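
The link-strength rule described in the reply above (1.0 per reply, summed per direction, with an optional exponential decay for older replies) can be sketched as follows. This is an illustration under assumptions, not Agora's code: the function name, the use of timestamps measured in days, and the `half_life_days` parameter are all hypothetical, since the thread only says that older replies get an "exponentially lower value" without giving the rate.

```python
from collections import defaultdict


def link_strengths(replies, now, half_life_days=None):
    """Sum directed link strengths from (replier, author, day) triples.

    Without decay every reply counts 1.0. With decay, a reply made
    `half_life_days` ago counts 0.5, twice that long ago 0.25, and so on
    (the half-life form is an assumption; any exponential would fit the
    description in the thread).
    """
    strengths = defaultdict(float)
    for replier, author, day in replies:
        if half_life_days is None:
            weight = 1.0
        else:
            age = max(0.0, now - day)  # replies dated in the future count as "now"
            weight = 0.5 ** (age / half_life_days)
        # Links are directional: (replier, author) differs from (author, replier).
        strengths[(replier, author)] += weight
    return dict(strengths)
```

For example, with `now=100` and `half_life_days=30`, one reply sent today and one sent 30 days ago give a link of strength 1.5 in that direction, while the same two replies without decay give 2.0.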