Re: [tor-dev] Python ExoneraTor

2014-06-10 Thread Kostas Jakeliunas
On Tue, Jun 10, 2014 at 10:38 AM, Karsten Loesing wrote:
> On 10/06/14 05:41, Damian Johnson wrote:
>>>> let me make one remark about optimizing Postgres defaults: I wrote quite
>>>> a few database queries in the past, and some of them perform horribly
>>>> (relay search) whereas others perform really well (ExoneraTor).  I
>>>> believe that the majority of performance gains can be achieved by
>>>> designing good tables, indexes, and queries.  Only as a last resort we
>>>> should consider optimizing the Postgres defaults.
>>>>
>>>> You realize that a searchable descriptor archive focuses much more on
>>>> database optimization than the ExoneraTor rewrite from Java to Python
>>>> (which would leave the database untouched)?
>>>
>>> Are other datastore models such as splunk or MongoDB useful?
>>> [splunk has a free yet proprietary limited binary... those having
>>> historical woes and takebacks, mentioned just for example here.]
>>
>> Earlier I mentioned the idea of Dynamo. Unless I'm mistaken this lends
>> itself pretty naturally to addresses as a hash key, and descriptor
>> dates as the range key. Lookups would then be O(log(n)) where n is the
>> total number of descriptors an address has published (... that is to
>> say very, very quick).
>>
>> This would be a fun project to give Boto a try. *sigh*... there really
>> should be more hours in the day...
>
> Quoting my reply to Damian to a similar question earlier in the thread:
>
>> I'm wary about moving to another database, especially NoSQL ones and/or 
>> cloud-based ones.  They don't magically make things faster, and Postgres is 
>> something I understand quite well by now. [...] Not saying that DynamoDB
>> can't be the better choice, but switching the database is not a priority for 
>> me.
>
> If somebody wants to give, say, MongoDB a try, I'd be interested in
> seeing the performance comparison to the current Postgres schema.  When
> you do, please consider all three search_* functions that the current
> schema offers, including searches for other IPv4 addresses in the same
> /24 and other IPv6 addresses in the same /48.

Personally, the only NoSQL system I've used (and had some really good
experiences with) is Redis, which is a kind of in-memory key-value
store with some nice simple data structures, such as sets and
operations on sets. So if you can reduce your problem to (e.g.) sets
and set operations, Redis might be a good fit.

(I think that isis is actually experimenting with Redis right now, to
do prop 226-bridgedb-database-improvements.txt)

If the things that you store in Redis can't be made to fit into
memory, you'll probably have a bad time.

So to generalize, if some relational data which needs to be searchable
can be made to fit into memory ("we can guarantee it wouldn't exceed x
GB [for t time]"), offloading that part onto some key-value (or some
more elaborate) system *might* make sense.
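
To make the "reduce it to sets" idea a bit more concrete, here is a
minimal sketch. It uses plain Python sets as a stand-in for Redis (the
corresponding Redis commands would be SADD and SINTER). The key names
and the one-set-per-date layout are made up for illustration and are
not taken from any existing schema.

```python
from collections import defaultdict

# Stand-in for a Redis-like store: each key maps to a set of members.
# In Redis itself the corresponding commands would be SADD and SINTER.
store = defaultdict(set)


def sadd(key, *members):
    """Add members to the set stored at key."""
    store[key].update(members)


def sinter(*keys):
    """Return the intersection of the sets stored at the given keys."""
    sets = [store[k] for k in keys]
    return set.intersection(*sets) if sets else set()


# Hypothetical key layout: one set of relay addresses per consensus date.
sadd("date:2014-06-09", "37.130.227.133", "128.31.0.34")
sadd("date:2014-06-10", "37.130.227.133", "37.130.227.134")

# Which addresses appear on both days?
both_days = sinter("date:2014-06-09", "date:2014-06-10")
```

Whether a layout like this fits in memory for the whole archive is
exactly the open question, so treat it as a toy.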

Also, I mixed up the link in footnote [2]. It should have linked to
this diagnostic postgres query:

https://github.com/wfn/torsearch/blob/master/misc/list_indexes_in_memory.sql

--

regards
Kostas
___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev


Re: [tor-dev] Python ExoneraTor

2014-06-10 Thread Kostas Jakeliunas
Hi all!

On Mon, Jun 9, 2014 at 10:22 AM, Karsten Loesing  wrote:
> On 09/06/14 01:26, Damian Johnson wrote:
>> Oh, and another quick thought - you once mentioned that a descriptor
>> search service would make ExoneraTor obsolete, and in looking it over
>> I agree. The search functionality ExoneraTor provides is trivial. The
>> only reason it requires such a huge database is because it's storing a
>> copy of every descriptor ever made.
>>
>> I suspect the actual right solution isn't to rewrite ExoneraTor at
>> all, but rather develop a new service that can be queried for this
>> descriptor data. That would make for a *much* more worthwhile project.
>>
>> ExoneraTor? Nice to have. Descriptor archive service? Damn useful. :)
>
> I agree, that was the idea behind Kostas' GSoC project last year.  And I
> still think it's a good idea.  It's just not trivial to get right.

Indeed, not trivial at all!

I'll use this space to mention the running metrics archive backend
modulo ExoneraTor stuff / what could be sorta-relevant here.

fwiw, the onionoo-like backend is still running at an obscure address:port:
http://ts.mkj.lt:/

TL;DR "what can I do with that" is: look at:

https://github.com/wfn/torsearch/blob/master/docs/use_cases_examples.md

In particular, regarding ExoneraTor-like queries (incl. arbitrary
subnet / part-of-ip lookups):

https://github.com/wfn/torsearch/blob/master/docs/use_cases_examples.md#exonerator-type-relay-participation-lookup

Not sure if it's worth discussing all the weaknesses of this archive
backend in this thread, but the short relevant version is that the
ExoneraTor-like functionality does mostly work, although I would need
to look into it to see how reliable the results are ("is this relay ip
address field really the one we should be using?", etc.)

But what's nice is that it is possible to do arbitrary queries on all
consensuses since ~2008, with no date specified (if you don't want
to.) (Which is to say, "it's possible", not necessarily "this is the
right way to do the solution for the problems in this thread")

So e.g. this is the ip address where moria runs, and we want to see
what relays have ever run on it:

http://ts.mkj.lt:/details?search=128.31.0.34

Take the fingerprint of the one that is currently running (moria1),
and look up its last 500 statuses (in a kind of condensed/summary
form): 
http://ts.mkj.lt:/statuses?lookup=9695DFC35FFEB861329B9F1AB04C46397020CE31&condensed=true

"from", "to" date ranges can be specified as e.g. 2009, 2009-02,
2009-02-10, 2009-02-10 02:00:00. limit/offset/parameters/etc.
specified here:
https://github.com/wfn/torsearch/blob/master/docs/onionoo_api.md
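
For anyone who wants to script these lookups, here is a rough sketch of
building the query URLs. It only constructs the strings and does not
contact the backend. The base URL is a placeholder (not the real
address:port), build_query is a made-up helper name, and the parameter
names are the ones used in the examples above; see onionoo_api.md for
the authoritative list.

```python
from urllib.parse import urlencode


def build_query(base_url, endpoint, **params):
    """Return a query URL for the given endpoint, skipping unset parameters."""
    # "from" is a Python keyword, so callers pass it as "from_" instead.
    cleaned = {k.rstrip("_"): v for k, v in params.items() if v is not None}
    return f"{base_url.rstrip('/')}/{endpoint}?{urlencode(cleaned)}"


BASE = "http://backend.example"  # placeholder, not the real address:port

# All relays ever seen on a given IP address.
details_url = build_query(BASE, "details", search="128.31.0.34")

# Condensed statuses of one relay within a date range.
statuses_url = build_query(
    BASE, "statuses",
    lookup="9695DFC35FFEB861329B9F1AB04C46397020CE31",
    condensed="true",
    from_="2009-02",
    to="2009-02-10 02:00:00",
)
```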

(Descriptors/digests aren't currently included (I think they used to),
but they can be, etc.)

The point is probably mostly about "this is some evidence that it can be done."
("But there are nuances, things are imperfect, time is needed, etc.")

The question really is regarding the actual scope of this rewrite, I suppose.

I'd probably agree with Karsten that just doing a port of the
ExoneraTor functionality as it currently is on
exonerator.torproject.org would be the safe bet. See how that goes,
venture into more exotic lands later on maybe, etc. (That doesn't mean
that I wouldn't be excited to put the current backend to good use,
and/or use the knowledge I gained to help you folks in some way!)

>
> Regarding your comment about storing a copy of every descriptor ever
> made, I believe that users trust ExoneraTor's results more if they see
> the actual descriptors that lead to results.  Of course, I'm saying that
> without knowing what ExoneraTor users actually want.  But let's not drop
> descriptor copies from the database easily.
>
> And, heh, when you say that the search functionality ExoneraTor provides
> is trivial, a little part of me is dying.  It's the part that spent a
> few weeks on getting the search functionality fast enough for
> production.  That was not at all trivial.  The oraddress24, oraddress48,
> and exitaddress24 fields as well as the indexes are the result of me
> running lots and lots of sample queries and wondering about Postgres'
> EXPLAIN ANALYZE results.  Just saying that it's not going to be trivial
> to generalize the search functionality towards other fields than IP
> addresses and dates.

Hear hear, I can only imagine! These things, and the ExoneraTor stuff
in particular, are not easy to do in a way that provides
**consistently** good/great performance.

I also spent some days last summer looking at EXPLAIN ANALYZE
results (it was a great feeling to start to understand what they mean
and how I can make them better), and eventually things do start making
sense. (And when they do, I also get that same feeling that NoSQL
stuff doesn't magically solve things.)

>
> If others want to follow, here's the SQL code I'm talking about:
>
> https://gitweb.torproject.org/exonerator.git/blob/HEAD:/db/exonerator.sql
>
> So, I'm happy to talk about writing a searchable descriptor a

[tor-dev] Reminder: tor development meetings this (and every) Wednesday, 19:00 UTC

2014-06-10 Thread Nick Mathewson
Here's your regular reminder for the weekly IRC meeting for people
working on the program "tor".  (This won't cover all the other
programs developed under the Tor umbrella.)

The next meeting time will be:

 Wednesday June 11, 19:00 UTC.

(That's 3pm EDT and 12:00 noon PDT.)

As usual, we'll do it on the #tor-dev IRC channel.

I've seen a drop-off in the number of people coming since I stopped
sending these out weekly, so I'm sending them again for a while.

Among other stuff, I expect we'll be wrapping up (or planning to wrap
up) 0.2.5.5-alpha and maybe doing some initial strategizing on 0.2.6.


cheers,
-- 
Nick


[tor-dev] GSoC: BridgeDB Twitter Distributor report

2014-06-10 Thread Kostas Jakeliunas
Hey all,

in the past weeks I've been working on understanding what can be done
using Twitter APIs and its media support in its CDN (for a later
captcha implementation), as well as on improving my existing Twitter
bridge distributor bot PoC. I've written some broken code, but it's
alright. More details below.


Distributor bot improvements included adding a churn rate control
mechanism which securely stores Twitter user IDs (with code and design
ideas from BridgeDB's HMAC approach to remembering e.g. email addresses
in the EmailDistributor), and implementing a (mostly) bogus text-based
challenge-response system. The latter is mostly so that we have a
generic design for doing challenge-responses in this distributor; later
on we'll be able to replace it with a decent CAPTCHA, for example. It's
just nice to have a generic system and something for testing out the
bot.
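
To illustrate the general idea (this is not BridgeDB's or the bot's
actual code), here is a minimal sketch of remembering users by a keyed
HMAC digest instead of the raw Twitter user ID. It assumes the rate
control amounts to limiting how often the same user is served; the key,
the interval, and the function names are all hypothetical.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-random-secret"  # hypothetical; load from config
MIN_INTERVAL = 3 * 60 * 60  # hypothetical: one response per user per 3 hours

last_served = {}  # keyed by HMAC digest, never by raw user ID


def user_token(user_id):
    """Return a keyed digest so the raw Twitter user ID is not stored."""
    return hmac.new(SECRET_KEY, str(user_id).encode(), hashlib.sha256).hexdigest()


def may_serve(user_id, now=None):
    """Return True and record the request if the user is not rate-limited."""
    now = time.time() if now is None else now
    token = user_token(user_id)
    previous = last_served.get(token)
    if previous is not None and now - previous < MIN_INTERVAL:
        return False
    last_served[token] = now
    return True
```

Without the secret key, the stored digests can't simply be recomputed
from a list of known user IDs, which is the point of using an HMAC
rather than a plain hash.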

I've also looked into using isis' new and shiny BridgeRequest objects
to process user (well) 'bridge requests' in a non-hacky way; this
should also eventually result in a bridge request syntax compatible
with (a subset of) GetTor commands. But I still need to figure out the
best way to use BridgeRequests, so nothing interesting to show yet.

TODO

 * (still yet to) summarize a nice meeting I've had with sysrqb and
isis. No definite conclusions were made, but there were (iirc) some
nice ideas about a generic BridgeDB API that could be used by third
party components, etc. (i.e. it might be worth pursuing it even if the
Social Distributor is to be implemented at some later point.)

 * clean up my mess, test new code not to fail, and push new things
onto https://github.com/wfn/twidibot/ (current (old) code there does
work, if anyone's curious to run it)

 * figure out BridgeRequests, the new IRequestBridges (ha!) interface,
and use these in the twitter bot

 * be able to 'serve' the bot fake bridge data so it could process it
in a way that may be compatible with a future BridgeDB API (i.e.,
hopefully this bot will be able to run as a third-party-thing,
separate from core bridgedb. This is hopefully how future distributors
will/should work.) This way the bot will be more/actually 'realistic'
in the way it serves current bogus bridge lines to users. (I thought
I'd have this by now, but I don't. Hrm.)

 * continue looking into captcha systems modulo what can be used in
the twitter context

 * look into bridgedb buckets and how I can help with them, so the
bridgedb API could happen sooner rather than later. (Old todo list
item, did not yet touch it.)

All in all, need to write more non-broken code, fewer words, and just
continue with the current bot.

Have a nice day/night/thing!

--

Kostas.

0x0e5dce45 @ pgp.mit.edu


Re: [tor-dev] [Tor2web] Proposal for improving social incentives for relay operators

2014-06-10 Thread Moritz Bartl
Hi Virgil,

I think a modified atlas that has a better top relay list and plays with
various (non-financial) gamification concepts is long overdue. When you look
at BOINC/SETI, it can work well. I agree that by simply interfacing with
onionoo (plus probably some aggregation of data), you can generate a
nice set of views that "give back warm and fuzzy feelings" and
"encourage competition". Diversity should be factored in, something that
we already do partly for the Torservers reimbursements.

I guess $someone should just go ahead and implement something. Hosting
it on some third party domain doesn't hurt, and if it is great, we can
still discuss moving it to something.tpo.org.

-- 
Moritz Bartl
https://www.torservers.net/


Re: [tor-dev] [Tor2web] Proposal for improving social incentives for relay operators

2014-06-10 Thread Virgil Griffith
General remarks:
* I agree 100% with your Dec 2013 post.
* All data I seek to make available in "Torati" is available from Onionoo.

The proposal is for the interface to Torati to be like Atlas, but keyed by
Tor nickname.
* However, where Atlas intends primarily to be a reference, Torati aims to
  be social reputation incentivization for operators.  So you'd want Torati
  to be seen by search engines using the user's nickname, e.g.,
  -- https://torati.torproject.org/TORTverLover
* A given nickname's contributions would be the sum across the relays with
  that nickname.

Which, for "TORTverLover", would sum the stats across:
   * https://atlas.torproject.org/#details/F2D3093388925780441433897F497797C5062B0B
   * https://atlas.torproject.org/#details/A8541EA02D2BBE97086BC7EF44A67E8FDA0C75A9
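
As a rough sketch of the per-nickname aggregation (not an existing
implementation): the fingerprints are the two from the Atlas links, but
the bandwidth figures, the record layout, and the function name are
made up for illustration. Real input would come from Onionoo.

```python
from collections import defaultdict

# Sample records; the bandwidth figures are invented for illustration.
relays = [
    {"nickname": "TORTverLover",
     "fingerprint": "F2D3093388925780441433897F497797C5062B0B",
     "bandwidth": 1000},
    {"nickname": "TORTverLover",
     "fingerprint": "A8541EA02D2BBE97086BC7EF44A67E8FDA0C75A9",
     "bandwidth": 2500},
    {"nickname": "other", "fingerprint": "0" * 40, "bandwidth": 400},
]


def totals_by_nickname(records):
    """Sum relay count and bandwidth over all relays sharing a nickname."""
    totals = defaultdict(lambda: {"relays": 0, "bandwidth": 0})
    for record in records:
        entry = totals[record["nickname"]]
        entry["relays"] += 1
        entry["bandwidth"] += record["bandwidth"]
    return dict(totals)


profile = totals_by_nickname(relays)["TORTverLover"]
```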


To answer your questions:

> (The last link is a 404.)

Try: https://dl.dropboxusercontent.com/u/3308162/Iajya%202013.pdf
But the most important papers are the first two I linked.


> Why not make it entirely opt-in?  We could include a subscription
> link in Weather's welcome messages that relay operators receive when
> their relay first receives the Stable flag.

I greatly prefer opt-out over opt-in.  Even if a Torati operator is in fact
reputation-hungry, I don't want the opt-in mechanic to encourage him/her to
be seen as reputation-hungry.  Moreover, as Atlas isn't opt-in, I see no
reason to deviate from that precedent, as this is really just a
"reverse-lookup" version of Atlas.


> Where does the name "Torati" originate from?

The name "Torati" is a Tor-ified version of "digerati" or "illuminati".
It's meant to convey something along the lines of "Tor Ninja".  It's a
positive term that one is proud to call oneself.  The name was chosen as a
component of the reputation social incentive.


-Virgil


On Tue, Jun 10, 2014 at 1:19 AM, Karsten Loesing 
wrote:

> [Attempting to move this discussion to tor-dev@ to avoid cross-posting;
> assuming my Reply-To: header won't get eaten by Mailman..]
>
> On 10/06/14 02:26, Virgil Griffith wrote:
> > For a while I've been seeking to grow the Tor network in both size and
> > goodput.  Towards this end, I've explored various avenues such as
> > increasing user-awareness via tor2web.  More recently, I've been exploring
> > financial incentives like TorCoin.
> >
> > Not wanting to strictly limit ourselves to financial incentives, I began
> > reading the literature on incentivizing volunteers.  The most relevant
> > papers I found are:
> >
> > * http://www-2.rotman.utoronto.ca/facbios/file/LMS2_ManSci-Paper-Final.pdf
> > * http://pareto.uab.es/~prey/gneezy_254.pdf
> > * https://dl.dropboxusercontent.com/u/3308162/Slonim%202013.pdf
>
> (The last link is a 404.)
>
> > The most relevant of these papers (Lacetera 2013) cites the major
> > motivations for volunteer labor are: "pure altruism, warm glow, self-image,
> > and reputation".  Upon reading this I realized TorCoin's technical
> > interestingness had blinded me to much easier to leverage motivations of
> > "warm glow" and "reputation".
> >
> > I propose the following system for harnessing "warm glow" and "reputation"
> > for Tor relay operators.  I am willing to fund this in its entirety.
> >
> > I propose establishing a subdomain on torproject.org giving each Tor relay
> > operator (hereafter affectionately called "Torati") his/her own page using
> > the information her machines provide to the Tor Directory Consensus.  The
> > fields to show on her "Torati profile page" would be things like:
> > ContactInfo, PGP fingerprint, list of server nicknames, date the Directory
> > Authorities first saw her contact info, etc.  You can also imagine
> > awarding special "recognition stars" for operating an exit or
> > bridge node.  Moreover, should some bandwidth measurement like EigenSpeed
> > or TorCoin gain traction, the Torati page could recognize contributors by
> > listing the sum total she has relayed to the Tor network.
> >
> > Naturally a node can opt-out of Torati recognition by setting a parameter
> > in the torrc file.
> >
> > I argue this would be a cheap and easy way to motivate operators to
> > volunteer more bandwidth for the Tor network.  As mentioned before, I am
> > willing to fund this in its entirety.
>
> Hi Virgil,
>
> adding more/better incentives for people to run relays and bridges
> sounds like a great plan!
>
> What you describe sounds related to what I suggested last December on
> this list:
>
> https://lists.torproject.org/pipermail/tor-dev/2013-December/005948.html
>
> > 9. Provide relay comparison metrics in Onionoo.  We could define some
> > simple metrics on the usefulness of a relay, like provided bandwidth or
> > uptime, in comparison to other relays.  A possible statement from these
> > metrics could be: "your relay provides more bandwidth than 95% of relays
> > in the network."  Similar to 8.  If Atlas [6] or Globe [8] or a
> > y

Re: [tor-dev] [GSoC 2014] Revamp GetTor: Send HTTP links for downloading TBB

2014-06-10 Thread Israel Leiva
2014-06-10 0:27 GMT-04:00 :

> Hello Israel,

Hi Michael.

> How do you qualify 'difficult?' Is this a duration matter or are
> there timeouts and repeated stream downloads? Is it a financial
> (money per megaoctet) problem for the users?

Actually, this is how Nima described it, giving Iran as an example. I don't
have further information on what this implies.

> Do you have statistics of how many users have a good versus bad
> experience and just how much lowering the bar to HTTP would make
> a difference in this regards?

I don't have any statistics regarding this matter. I was hoping people on
this ML could contribute with real data and/or examples.

> Sorry for so many questions, I'm not in the 'difficult' category
> so have no idea.

No problem, I'm not in this category either, and asking these questions is
certainly helpful.

> Good choice, I hope you get the answer you're looking for.

Thanks!

-- 
israel


[tor-dev] GSoC: Multicore daemon status report

2014-06-10 Thread Белоус Михаил
Hi all,


As the start of my project, I made a patch to the work queue that makes
the thread pool more efficient, and wrote some benchmarks.

Future work:
1. Rewrite the architecture of tor circuit processing to use the thread pool.
2. Write benchmarks showing whether the multicore implementation is
   efficient or not.

Most of my work can be found at https://github.com/towelenee/tor.
Please feel free to provide feedback of any sort.

Cheers,
Mikhail


Re: [tor-dev] Proposal for improving social incentives for relay operators

2014-06-10 Thread Andrew Lewman
On 06/09/2014 08:26 PM, Virgil Griffith wrote:
> For a while I've been seeking to grow the Tor network in both size and
> goodput.  Towards this end, I've explored various avenues such as
> increasing user-awareness via tor2web.  More recently, I've been exploring
> financial incentives like TorCoin.

This is great that you care about growing the Tor network. Thanks for
the thoughts.

However, can we please, please stop using Tor in the name of everything?
Our trademark lawyers love the business, but we'd rather spend money on
developers and improving tor; not defending our name to keep everyone
from being confused as to what is the real Tor or not. People, press,
and companies are already calling us at Tor thinking we wrote torcoin
and have approved it. We did not write torcoin nor do we approve of it
(as far as I know).

> The most relevant of these papers (Lacetera 2013) cites the major
> motivations for volunteer labor are: "pure altruism, warm glow, self-image,
> and reputation".  Upon reading this I realized TorCoin's technical
> interestingness had blinded me to much easier to leverage motivations of
> "warm glow" and "reputation".

Perhaps join the EFF's Tor Challenge? https://www.eff.org/torchallenge/
They would love the help.

-- 
Andrew
pgp 0x6B4D6475
https://www.torproject.org/
+1-781-948-1982


Re: [tor-dev] Python ExoneraTor

2014-06-10 Thread Karsten Loesing
On 10/06/14 18:14, Damian Johnson wrote:
>> ... including searches for other IPv4 addresses in the same
>> /24 and other IPv6 addresses in the same /48.
> 
> Ahhh. *That* would indeed make this a lot more of a pita. ExoneraTor
> gives no indication that it accepts /24 or /48 ranges. Is that
> capability even used by visitors?

ExoneraTor doesn't indicate that, but if a search for a certain IP
address returns no results it looks up nearby addresses.

Example: search for 37.130.227.132 (.133 is TorLand1)

Result:

"""
We did not find IP address 37.130.227.132 in any of the relay or exit
lists that were published between 2014-06-09 and 2014-06-11.

The following other IP addresses of Tor relays in the same /24 network
were found in relay and/or exit lists around the time that could be
related to IP address 37.130.227.132:

37.130.227.133
37.130.227.134
"""

Now, I can't say whether users would expect that or do anything with
those "nearby addresses".  I found it possibly useful when I wrote the
service.  But I'm not at all a usability expert. ;)
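
For illustration only (this is not ExoneraTor's actual code, which does
the lookup in the database), the "nearby addresses" fallback can be
sketched with Python's ipaddress module. The function names and the
in-memory list of known addresses are assumptions made for the example.

```python
import ipaddress


def enclosing_network(address):
    """Return the /24 (IPv4) or /48 (IPv6) network containing address."""
    ip = ipaddress.ip_address(address)
    prefix = 24 if ip.version == 4 else 48
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)


def nearby_addresses(query, known_addresses):
    """Return known addresses in the same network as query, excluding query."""
    network = enclosing_network(query)
    target = ipaddress.ip_address(query)
    matches = []
    for candidate in known_addresses:
        ip = ipaddress.ip_address(candidate)
        if ip.version == network.version and ip in network and ip != target:
            matches.append(ip)
    return [str(ip) for ip in sorted(matches)]


known = ["37.130.227.134", "37.130.227.133", "128.31.0.34"]
nearby = nearby_addresses("37.130.227.132", known)
```

With the sample list, nearby contains the same two addresses as in the
result quoted above.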

All the best,
Karsten



Re: [tor-dev] Python ExoneraTor

2014-06-10 Thread Damian Johnson
> ... including searches for other IPv4 addresses in the same
> /24 and other IPv6 addresses in the same /48.

Ahhh. *That* would indeed make this a lot more of a pita. ExoneraTor
gives no indication that it accepts /24 or /48 ranges. Is that
capability even used by visitors?


Re: [tor-dev] Proposal for improving social incentives for relay operators

2014-06-10 Thread Karsten Loesing
[Attempting to move this discussion to tor-dev@ to avoid cross-posting;
assuming my Reply-To: header won't get eaten by Mailman..]

On 10/06/14 02:26, Virgil Griffith wrote:
> For a while I've been seeking to grow the Tor network in both size and
> goodput.  Towards this end, I've explored various avenues such as
> increasing user-awareness via tor2web.  More recently, I've been exploring
> financial incentives like TorCoin.
> 
> Not wanting to strictly limit ourselves to financial incentives, I began
> reading the literature on incentivizing volunteers.  The most relevant
> papers I found are:
> 
> * http://www-2.rotman.utoronto.ca/facbios/file/LMS2_ManSci-Paper-Final.pdf
>  * http://pareto.uab.es/~prey/gneezy_254.pdf
> * https://dl.dropboxusercontent.com/u/3308162/Slonim%202013.pdf

(The last link is a 404.)

> The most relevant of these papers (Lacetera 2013) cites the major
> motivations for volunteer labor are: "pure altruism, warm glow, self-image,
> and reputation".  Upon reading this I realized TorCoin's technical
> interestingness had blinded me to much easier to leverage motivations of
> "warm glow" and "reputation".
> 
> I propose the following system for harnessing "warm glow" and "reputation"
> for Tor relay operators.  I am willing to fund this in its entirety.
> 
> I propose establishing a subdomain on torproject.org giving each Tor relay
> operator (hereafter affectionately called "Torati") his/her own page using
> the information her machines provide to the Tor Directory Consensus.  The
> fields to show on her "Torati profile page" would be things like:
> ContactInfo, PGP fingerprint, list of server nicknames, date the Directory
> Authorities first saw her contact info, etc.  You can also imagine
> awarding special "recognition stars" for operating an exit or
> bridge node.  Moreover, should some bandwidth measurement like EigenSpeed
> or TorCoin gain traction, the Torati page could recognize contributors by
> listing the sum total she has relayed to the Tor network.
> 
> Naturally a node can opt-out of Torati recognition by setting a parameter
> in the torrc file.
> 
> I argue this would be a cheap and easy way to motivate operators to
> volunteer more bandwidth for the Tor network.  As mentioned before, I am
> willing to fund this in its entirety.

Hi Virgil,

adding more/better incentives for people to run relays and bridges
sounds like a great plan!

What you describe sounds related to what I suggested last December on
this list:

https://lists.torproject.org/pipermail/tor-dev/2013-December/005948.html

> 9. Provide relay comparison metrics in Onionoo.  We could define some
> simple metrics on the usefulness of a relay, like provided bandwidth or
> uptime, in comparison to other relays.  A possible statement from these
> metrics could be: "your relay provides more bandwidth than 95% of relays
> in the network."  Similar to 8.  If Atlas [6] or Globe [8] or a
> yet-to-be-written Facebook application or a also-yet-to-be-written
> Twitter integration into Tor Weather (#10372) tell the world how
> successful someone's running Tor relays, maybe that encourages others to
> run relays, too.  We could even invent a points system for running
> relays, with additional points for running exits, if that makes the Tor
> network better.  Probably needs input from a community coordinator
> person.  (Orange part in the diagram.)
>
> [6] https://atlas.torproject.org/
> [8] https://globe.torproject.org/

Want to take a look at Onionoo and see whether it already provides the
information and functionality you need, and if not, open tickets for the
missing pieces?

https://onionoo.torproject.org/
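
As a sketch of how the comparison statement from item 9 could be
computed (assuming "more bandwidth than X% of relays" means the share of
relays with strictly lower bandwidth; the function name and the sample
numbers are made up):

```python
def percent_outperformed(value, population):
    """Return the percentage of population values strictly below value."""
    if not population:
        raise ValueError("population must not be empty")
    below = sum(1 for other in population if other < value)
    return 100.0 * below / len(population)


# Invented bandwidth figures for ten relays.
bandwidths = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
share = percent_outperformed(950, bandwidths)
message = (
    f"your relay provides more bandwidth than {share:.0f}% of relays "
    "in the network"
)
```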

But let me also give you some quick feedback on your proposal:

 - Why not make it entirely opt-in?  We could include a subscription
link in Weather's welcome messages that relay operators receive when
their relay first receives the Stable flag.

 - Where does the name "Torati" originate from?

All the best,
Karsten



Re: [tor-dev] Python ExoneraTor

2014-06-10 Thread Karsten Loesing
On 10/06/14 05:41, Damian Johnson wrote:
>>> let me make one remark about optimizing Postgres defaults: I wrote quite
>>> a few database queries in the past, and some of them perform horribly
>>> (relay search) whereas others perform really well (ExoneraTor).  I
>>> believe that the majority of performance gains can be achieved by
>>> designing good tables, indexes, and queries.  Only as a last resort we
>>> should consider optimizing the Postgres defaults.
>>>
>>> You realize that a searchable descriptor archive focuses much more on
>>> database optimization than the ExoneraTor rewrite from Java to Python
>>> (which would leave the database untouched)?
>>
>> Are other datastore models such as splunk or MongoDB useful?
>> [splunk has a free yet proprietary limited binary... those having
>> historical woes and takebacks, mentioned just for example here.]
> 
> Earlier I mentioned the idea of Dynamo. Unless I'm mistaken this lends
> itself pretty naturally to addresses as a hash key, and descriptor
> dates as the range key. Lookups would then be O(log(n)) where n is the
> total number of descriptors an address has published (... that is to
> say very, very quick).
> 
> This would be a fun project to give Boto a try. *sigh*... there really
> should be more hours in the day...
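
To make the key layout concrete: a minimal in-memory sketch of the
"address as hash key, descriptor date as range key" idea, using a dict
and binary search rather than DynamoDB or Boto. The names and the
sample dates are invented. The range lookup is O(log(n)) in the number
of dates stored for an address, plus the size of the result.

```python
import bisect
from collections import defaultdict

# address (hash key) -> sorted list of descriptor dates (range key)
index = defaultdict(list)


def add_descriptor(address, published):
    """Record that address published a descriptor on the given ISO date."""
    # insort keeps the list sorted; the insertion itself is O(n).
    bisect.insort(index[address], published)


def lookup(address, start, end):
    """Return dates in [start, end] for address via binary search."""
    dates = index.get(address, [])
    lo = bisect.bisect_left(dates, start)
    hi = bisect.bisect_right(dates, end)
    return dates[lo:hi]


for day in ("2014-06-11", "2014-06-09", "2014-06-10", "2014-05-01"):
    add_descriptor("37.130.227.133", day)

hits = lookup("37.130.227.133", "2014-06-09", "2014-06-10")
```

Note that this only covers exact-address lookups; searches for other
addresses in the same /24 or /48 would need an additional key.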

Quoting my reply to Damian to a similar question earlier in the thread:

> I'm wary about moving to another database, especially NoSQL ones and/or 
> cloud-based ones.  They don't magically make things faster, and Postgres is 
> something I understand quite well by now. [...] Not saying that DynamoDB
> can't be the better choice, but switching the database is not a priority for 
> me.

If somebody wants to give, say, MongoDB a try, I'd be interested in
seeing the performance comparison to the current Postgres schema.  When
you do, please consider all three search_* functions that the current
schema offers, including searches for other IPv4 addresses in the same
/24 and other IPv6 addresses in the same /48.

All the best,
Karsten
