Re: [fonc] Sorting the WWW mess

2012-03-02 Thread Martin Baldan
Julian,

I'm not sure I understand your proposal, but I do think what Google
does is not something trivial, straightforward or easy to automate. I
remember reading an article about Google's ranking strategy. IIRC,
they use the patterns of mutual linking between websites. So far, so
good. But then, when Google became popular, some companies started to
build link farms, to make themselves look more important to Google.
When Google finds out about this behavior, they kick the company to
the bottom of the index. I'm sure they have many secret automated
schemes to do this kind of thing, but it's essentially an arms race,
and it takes constant human attention. Local search is much less
problematic, but still you can end up with a huge pile of unstructured
data, or a huge bowl of linked spaghetti mess, so it may well make
sense to ask a third party for help to sort it out.
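
A minimal sketch of the link-analysis idea mentioned above, assuming the publicly described PageRank formulation (power iteration with a damping factor). Google's production ranking and its anti-spam measures are not public, so the function name `pagerank`, the damping value, and the toy `web` graph are illustrative assumptions only. The toy graph includes a tiny link farm to show why raw link analysis can be gamed:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by incoming links.

    `links` maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline score (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three ordinary pages plus a link farm boosting "spam".
web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "farm1": ["spam"],
    "farm2": ["spam"],
    "farm3": ["spam"],
    "spam": [],
}
scores = pagerank(web)
```

On this toy graph the `spam` page scores higher than any of the farm pages pointing at it, which is the effect link farms try to exploit; spotting and penalising that pattern is the arms race described above.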

I don't think there's anything architecturally centralized about using
Google as a search engine; it's just a matter of popularity. You also
have Bing, DuckDuckGo, whatever.

On the other hand, data storage and bandwidth are very centralized.
Dropbox, Google Docs, and iCloud are all symptoms of the fact that PC
operating systems were designed for local storage. I've been looking
at possible alternatives. There are distributed fault-tolerant network
filesystems like XtreemFS (and even the Linux-based XtreemOS), or
Tahoe-LAFS (with object-capabilities!), or maybe a more P2P approach
such as Tribler (a tracker-free BitTorrent client), and for shared
bandwidth apparently there is BitTorrent Live (P2P streaming). But I
don't know how to put all that together into a usable computing
experience. For instance, Squeak is a single-file image, so I guess it
can't benefit from file-based capabilities unless the objects were
mapped to files in some way. Oh, well, this is for another thread.


-Best

 Martin

On Fri, Mar 2, 2012 at 6:54 AM, Julian Leviston jul...@leviston.net wrote:
> Right you are. Centralised search seems a bit silly to me.
>
> Take object orientedism and apply it to search and you get a thing where
> each node searches itself when asked...  apply this to a local-focussed
> topology (i.e. spider web search out) and utilise intelligent caching (so
> search the localised caches first) and you get a better thing, no?
>
> Why not do it like that? Or am I limited in my thinking about this?
>
> Julian

___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Sorting the WWW mess

2012-03-02 Thread BGB

On 3/2/2012 8:37 AM, Martin Baldan wrote:

> Julian,
>
> I'm not sure I understand your proposal, but I do think what Google
> does is not something trivial, straightforward or easy to automate. I
> remember reading an article about Google's ranking strategy. IIRC,
> they use the patterns of mutual linking between websites. So far, so
> good. But then, when Google became popular, some companies started to
> build link farms, to make themselves look more important to Google.
> When Google finds out about this behavior, they kick the company to
> the bottom of the index. I'm sure they have many secret automated
> schemes to do this kind of thing, but it's essentially an arms race,
> and it takes constant human attention. Local search is much less
> problematic, but still you can end up with a huge pile of unstructured
> data, or a huge bowl of linked spaghetti mess, so it may well make
> sense to ask a third party for help to sort it out.
>
> I don't think there's anything architecturally centralized about using
> Google as a search engine; it's just a matter of popularity. You also
> have Bing, DuckDuckGo, whatever.


yeah.

the main thing Google does is scavenging and aggregating data.
and, they have done fairly well at it...

and they make money mostly via ads...



> On the other hand, data storage and bandwidth are very centralized.
> Dropbox, Google Docs, and iCloud are all symptoms of the fact that PC
> operating systems were designed for local storage. I've been looking
> at possible alternatives. There are distributed fault-tolerant network
> filesystems like XtreemFS (and even the Linux-based XtreemOS), or
> Tahoe-LAFS (with object-capabilities!), or maybe a more P2P approach
> such as Tribler (a tracker-free BitTorrent client), and for shared
> bandwidth apparently there is BitTorrent Live (P2P streaming). But I
> don't know how to put all that together into a usable computing
> experience. For instance, Squeak is a single-file image, so I guess it
> can't benefit from file-based capabilities unless the objects were
> mapped to files in some way. Oh, well, this is for another thread.


agreed.

just because I might want to have better internet file-systems, doesn't 
necessarily mean I want all my data to be off on someone's server somewhere.


much more preferable would be if I could remotely access data stored on 
my own computer.


the problem is that neither OS's nor networking hardware were really 
designed for this:
broadband routers tend to assume by default that the network is being 
used purely for pulling content off the internet, ...


at this point, it means convenience either requires some sort of central 
server to pull data from, or bouncing off of such a server (sort of like 
some sort of Reverse FTP: the computer holding the data connects to a 
server and makes its data visible there, and other computers connect to 
the server to access the data stored on that PC, probably with some 
file-proxy magic and mirroring and similar...).
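
The "Reverse FTP" arrangement described above can be sketched as an in-memory simulation. Everything here (the `Relay` and `HomePC` classes and their method names) is hypothetical and only shows the shape of the idea; a real version would need outbound network connections from the home machine, plus authentication and encryption:

```python
class Relay:
    """Hypothetical rendezvous server that forwards requests to home machines."""

    def __init__(self):
        self._hosts = {}  # host id -> callable that serves a path

    def register(self, host_id, handler):
        # Stands in for an outbound connection opened (and kept open) by the
        # home PC, which is what lets it work from behind NAT.
        self._hosts[host_id] = handler

    def fetch(self, host_id, path):
        # A remote client asks the relay; the relay forwards to the home PC.
        handler = self._hosts.get(host_id)
        if handler is None:
            raise LookupError(f"host {host_id!r} is not connected")
        return handler(path)


class HomePC:
    """The machine that actually holds the data."""

    def __init__(self, files):
        self._files = files  # path -> bytes

    def serve(self, path):
        return self._files[path]

    def connect(self, relay, host_id):
        relay.register(host_id, self.serve)


relay = Relay()
HomePC({"notes.txt": b"stored at home"}).connect(relay, "desktop")
remote_copy = relay.fetch("desktop", "notes.txt")
```

The data never has to live on the relay; it only passes requests through, so storage is limited by the home machine's disk rather than by a hosted quota.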


technically, the above could be like a more organized version of a P2P 
file-sharing system, and could instead focus more on sharing for 
individuals (between their devices) or between groups. unlike with a 
central server, it allows for much more storage space (one can easily 
have TB of shared space, rather than worrying about several GB or 
similar on some server somewhere).


nicer would be if it could offer a higher-performance alternative to a 
Mercurial- or Git-style system (rather than simply being a raw shared 
filesystem).



better though would be if broadband routers and DNS worked in a way 
which made it fairly trivial for pretty much any computer to be easily 
accessible remotely, without having to mess around with port-forwarding 
and other things.



potentially, if/when the last mile internet migrates to IPv6, this 
could help (as then presumably both NAT and dynamic IP addresses can 
partly go away).


but, it is taking its time, and neither ISPs nor broadband routers seem 
to yet support IPv6...







Re: [fonc] Sorting the WWW mess

2012-03-01 Thread Martin Baldan
Loup,

I agree that the Web is a mess. The original sin was to assume that people
would only want to connect to other computers in order to retrieve a
limited set of static documents. I think the reason for this was that
everyone stuck to the Unix security model, where everything you run has
all the permissions you have. That's why you don't want to run code from
untrusted sources. If they had used a capability-based security model from
the start, this concern would probably not have arisen.
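
A rough sketch of the difference, assuming a capability is modelled as an object reference that carries exactly one permission. The names `ReadCapability` and `untrusted_word_count` are made up for illustration, and Python itself does not enforce any of this (untrusted Python code can still import `os`); the sketch only shows the shape of the discipline:

```python
import io


class ReadCapability:
    """Hypothetical handle that grants read access to one stream and nothing else."""

    def __init__(self, stream):
        self._stream = stream

    def read(self):
        return self._stream.read()


def untrusted_word_count(doc):
    # Under the Unix model this code would run with all of the user's
    # permissions. Here it is handed a single capability, so by convention
    # the only thing it can do is read that one document.
    return len(doc.read().split())


notes = io.StringIO("the web is a mess")
count = untrusted_word_count(ReadCapability(notes))
```

The caller decides what authority to hand over per call, rather than the code inheriting everything the user is allowed to do.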

Also, a deeper culprit, in my opinion, is Intellectual Property. There were
several great networking protocols before the internet, but they were
usually proprietary protocols for proprietary operating systems. Don't
forget that, for instance, Plan 9 was not open sourced until 2000 or 2002.
Now there's a lot of talk of open standards, but there was a time when the
main sources of open standards were half-baked government projects. The main
reason why the IBM PC architecture dominates is that Compaq managed to
clone it legally. The main reason why Microsoft operating systems got to
dominate is that they were ready from the start to run on those cheap and
widespread IBM PC clones, both technically and legally.

I also think that the internet, with its silly limited IP numbers and DNS
servers, smacks of premature optimization. I mean, configuring a network
feels a bit like programming in machine code. There's also the issue of
one-way links, which create the need for complex feedback mechanisms such
as RSS. Moreover, regular URLs are so ephemeral that they gave rise to
permalinks. Then again, if it were all based on two-way links,
maybe we would need a complex system for transparent anonymous linking,
some kind of virtual link.

That said, I don't see why you have an issue with search engines and search
services. Even on your own machine, searching files with complex properties
is far from trivial. When outside, untrusted sources are involved, you need
someone to tell you what is relevant, what is not, who is lying, and so on.
Google got to dominate that niche for the right reasons, namely, being much
better than the competition.

Best,

 -Martin


Re: [fonc] Sorting the WWW mess

2012-03-01 Thread Loup Vaillant

Martin Baldan wrote:

> That said, I don't see why you have an issue with search engines and
> search services. Even on your own machine, searching files with complex
> properties is far from trivial. When outside, untrusted sources are
> involved, you need someone to tell you what is relevant, what is not,
> who is lying, and so on. Google got to dominate that niche for the right
> reasons, namely, being much better than the competition.


I wasn't clear.  Actually, I didn't want to state my opinion.  I can't
find the message, but I (incorrectly?) remembered Alan saying that
one-way links basically created the need for big search engines.  As I
couldn't imagine an architecture that could do away with centralized
search engines, I wanted to ask about it.

That said, I do have issues with Big Data search engines: they are
centralized.  That alone gives them more power than I'd like them to
have.  If we could remove the centralization while keeping the good
stuff (namely, finding things), that would be really cool.

Loup.


Re: [fonc] Sorting the WWW mess

2012-03-01 Thread Alan Kay
Hi Loup

Someone else said that about links.

Browsing is about either knowing where you are (and going) and/or about dealing 
with a rough max of 100 items. After that, search is necessary.

However, Ted Nelson said a lot in each of the last 5 decades about what kinds 
of linking do the most good. (Chase down what he has to say about why one-way 
links are not what should be done.) He advocated from the beginning that the 
provenance of links must be preserved (which also means that you cannot copy 
what is being pointed to without also copying its provenance). This allows a 
much better way to deal with all manner of usage, embeddings, etc. -- including 
both fair use and also various forms of micropayments and subscriptions.
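
A small sketch of what two-way links with preserved provenance could look like as a data structure. This is not Nelson's Xanadu design; the `Link` and `LinkStore` names and the `author` field are assumptions chosen only to make the idea concrete:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str  # document that contains the link
    target: str  # document being pointed to
    author: str  # provenance: who made the link


class LinkStore:
    """Indexes every link in both directions."""

    def __init__(self):
        self._outgoing = defaultdict(set)
        self._incoming = defaultdict(set)

    def add(self, link):
        # Recording the link at both ends is what makes it two-way.
        self._outgoing[link.source].add(link)
        self._incoming[link.target].add(link)

    def links_from(self, doc):
        return set(self._outgoing.get(doc, set()))

    def links_to(self, doc):
        # With one-way links this question needs a crawler and a search engine.
        return set(self._incoming.get(doc, set()))


store = LinkStore()
store.add(Link(source="essay", target="memo", author="alice"))
backlinks = store.links_to("memo")
```

Because each `Link` keeps its source and author, anything that follows or copies a link also carries where it came from.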

One way to handle this requirement is via protection mechanisms that real 
objects can supply.

Cheers,

Alan







Re: [fonc] Sorting the WWW mess

2012-03-01 Thread David Barbour
On Thu, Mar 1, 2012 at 7:08 AM, Martin Baldan martino...@gmail.com wrote:

> I think it was Julian, in message:
>
> http://vpri.org/mailman/private/fonc/2012/003131.html
>
> BTW, I'm having a hard time trying to find who said what in this mailing
> list. Maybe I'm missing something, I feel a bit silly, but here's the
> problem:
>
> _ Apparently, Google can't search this mailing list, I guess it's because
> of its private nature. For instance, the query:
>
> google site:http://vpri.org/mailman/private/fonc/2012/thread.html
>
> yields no results.
>
> _ I can search e-mails for keywords in my Gmail account, but when I find
> one, I don't know what message number it is. I only see the date and time.
>
> _ The mailing list web interface lets me arrange messages by date, but it
> doesn't show me the date of each message in a column.
>
> So what should I do?


http://www.mail-archive.com/fonc@vpri.org/




> As for centralization, I don't think you can avoid some degree of natural
> centralization of trust. For instance, I tend to trust the VPRI people when
> it comes to programming-related theory and ideas. Am I giving them too much
> power? ;)
>
> What should be avoided is single points of failure in infrastructure. I
> should be able to decide whom to trust, without artificial limits imposed
> by the technology.
>
> Best,
>
>  -Martin




Re: [fonc] Sorting the WWW mess

2012-03-01 Thread Martin Baldan
Ah, thanks! :)

On Thu, Mar 1, 2012 at 6:26 PM, David Barbour dmbarb...@gmail.com wrote:

> http://www.mail-archive.com/fonc@vpri.org/


Re: [fonc] Sorting the WWW mess

2012-03-01 Thread Casey Ransberger
On Thu, Mar 1, 2012 at 7:04 AM, Alan Kay alan.n...@yahoo.com wrote:

> Hi Loup
>
> snip
>
> However, Ted Nelson said a lot in each of the last 5 decades about what
> kinds of linking do the most good. (Chase down what he has to say about why
> one-way links are not what should be done.) He advocated from the beginning
> that the provenance of links must be preserved (which also means that you
> cannot copy what is being pointed to without also copying its provenance).
> This allows a much better way to deal with all manner of usage, embeddings,
> etc. -- including both fair use and also various forms of micropayments and
> subscriptions.


If only we could find a way to finally deal with all that intertwingularity!


-- 
Casey Ransberger


Re: [fonc] Sorting the WWW mess

2012-03-01 Thread Max Orhai
Nelson's still kicking, you know: see http://gzigzag.sourceforge.net/ for
some recent spin-offs.

-- Max



Re: [fonc] Sorting the WWW mess

2012-03-01 Thread Julian Leviston
Right you are. Centralised search seems a bit silly to me.

Take object orientedism and apply it to search and you get a thing where each 
node searches itself when asked...  apply this to a local-focussed topology (i.e. 
spider web search out) and utilise intelligent caching (so search the localised 
caches first) and you get a better thing, no?

Why not do it like that? Or am I limited in my thinking about this?
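
One way to read the proposal above, as a toy in-process model: each node answers queries about its own documents, a query spreads outward through neighbours for a limited number of hops, and the node that was asked caches the combined answer. The `Node` class, the `hops` limit, and substring matching are all assumptions made for this sketch; it ignores ranking, trust, and cache invalidation, which are the hard parts raised elsewhere in the thread:

```python
class Node:
    """Hypothetical peer that can search itself and ask its neighbours."""

    def __init__(self, name, documents):
        self.name = name
        self.documents = documents  # title -> text
        self.neighbours = []        # other Node objects
        self.cache = {}             # (query, hops) -> frozenset of hits
        self.scans = 0              # how often this node searched itself

    def search_self(self, query):
        self.scans += 1
        return {(self.name, title)
                for title, text in self.documents.items() if query in text}

    def search(self, query, hops=2):
        key = (query, hops)
        if key in self.cache:
            # Local cache first: no network traffic at all.
            return set(self.cache[key])
        hits = set()
        seen = {self.name}
        frontier = [self]
        # Breadth-first spread: nearest nodes are asked first.
        for _ in range(hops + 1):
            next_frontier = []
            for node in frontier:
                hits |= node.search_self(query)
                for neighbour in node.neighbours:
                    if neighbour.name not in seen:
                        seen.add(neighbour.name)
                        next_frontier.append(neighbour)
            frontier = next_frontier
        self.cache[key] = frozenset(hits)
        return hits
```

Calling `search` twice with the same query on the same node touches the neighbours only the first time. What this sketch cannot do is tell a node which of the returned results to trust, which is the part a central index currently handles.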

Julian
