Re: [OSM-dev] using Osmium to filter osh files

2014-05-23 Thread Jochen Topf
Hi!

Wow. That's a lot. I suggest you start with something simpler, like the
amenity tags on nodes, and then work your way up. For everything that
involves nodes and ways together, you have to read the input file twice
and/or store data in between, which makes it more complicated. This is
especially complex for history files.
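
To illustrate the "store data in between" part for history files, here is a
minimal sketch in plain Python. It is not Osmium code: the names NodeHistory
and way_midpoint, the record layout, and the use of the coordinate mean as a
way's "midpoint" are all assumptions made for illustration. It also assumes
that the versions of a node arrive in ascending timestamp order, and it
ignores deleted versions.

```python
from bisect import bisect_right
from collections import defaultdict


class NodeHistory:
    """Keeps every version of every node in memory (hypothetical helper)."""

    def __init__(self):
        # node id -> parallel lists of timestamps and (lat, lon) locations
        self._times = defaultdict(list)
        self._locs = defaultdict(list)

    def add(self, node_id, timestamp, lat, lon):
        # Assumes versions are added in ascending timestamp order, so each
        # per-node list stays sorted. ISO 8601 UTC strings in one fixed
        # format compare correctly as plain strings.
        self._times[node_id].append(timestamp)
        self._locs[node_id].append((lat, lon))

    def location_at(self, node_id, timestamp):
        """Location of the node version that was current at `timestamp`."""
        times = self._times.get(node_id, [])
        idx = bisect_right(times, timestamp)
        if idx == 0:
            return None  # the node did not exist yet at that time
        return self._locs[node_id][idx - 1]


def way_midpoint(history, node_refs, timestamp):
    """Mean of the member node locations that were current at `timestamp`."""
    points = [history.location_at(ref, timestamp) for ref in node_refs]
    points = [p for p in points if p is not None]
    if not points:
        return None
    lat = sum(p[0] for p in points) / len(points)
    lon = sum(p[1] for p in points) / len(points)
    return (lat, lon)


# First pass: collect all node versions.
history = NodeHistory()
history.add(1, "2010-01-01T00:00:00Z", 10.0, 20.0)
history.add(1, "2012-01-01T00:00:00Z", 12.0, 20.0)  # node 1 was moved later
history.add(2, "2010-01-01T00:00:00Z", 14.0, 22.0)

# Second pass: resolve a way version against the nodes current at its time.
early = way_midpoint(history, [1, 2], "2011-06-01T00:00:00Z")
late = way_midpoint(history, [1, 2], "2013-06-01T00:00:00Z")
```

Holding all node versions in memory like this is only realistic for small
extracts; for a larger history file the stored data would have to go to disk
or be restricted to the nodes that the ways of interest actually reference.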

Concerning Osmium: the new Osmium and its documentation are a work in
progress, and it will take a while for all of that to appear. I fear you'll
have to make do with what's there. The handler concept is quite similar to
what the old Osmium was doing, though, and there are some links to talks and
blog posts on http://wiki.openstreetmap.org/wiki/Osmium that you can read to
get a better idea of the general architecture. Basically, a file is read and,
for each object read from the file, a callback is called on each of the
handlers in turn to work on that object.
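
The handler idea can be sketched in a few lines of plain Python. This is only
an illustration of the control flow, not the Osmium API: the Handler base
class, the dispatch function and the dictionary layout of the objects are all
hypothetical.

```python
class Handler:
    """Base class with one callback per object type; all do nothing by default."""

    def node(self, obj):
        pass

    def way(self, obj):
        pass

    def relation(self, obj):
        pass


class AmenityCounter(Handler):
    """Counts nodes that carry an amenity tag."""

    def __init__(self):
        self.count = 0

    def node(self, obj):
        if "amenity" in obj["tags"]:
            self.count += 1


class UserCollector(Handler):
    """Collects the names of all users who touched a node or a way."""

    def __init__(self):
        self.users = set()

    def node(self, obj):
        self.users.add(obj["user"])

    def way(self, obj):
        self.users.add(obj["user"])


def dispatch(objects, *handlers):
    """Call the matching callback on each handler, in turn, for every object."""
    for obj in objects:
        for handler in handlers:
            getattr(handler, obj["type"])(obj)


# Stand-in for objects as they would be read from a file.
objects = [
    {"type": "node", "id": 1, "tags": {"amenity": "cafe"}, "user": "alice"},
    {"type": "node", "id": 2, "tags": {}, "user": "bob"},
    {"type": "way", "id": 3, "tags": {"highway": "residential"}, "user": "alice"},
]

amenities = AmenityCounter()
users = UserCollector()
dispatch(objects, amenities, users)
```

Each handler only overrides the callbacks it cares about, so several
independent analyses can share a single pass over the input.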

Jochen

On Thu, May 22, 2014 at 11:53:37 -0400, Abhishek wrote:
 Date: Thu, 22 May 2014 11:53:37 -0400
 From: Abhishek dalek2poi...@gmail.com
 To: Jochen Topf joc...@remote.org
 Cc: dev@openstreetmap.org
 Subject: Re: [OSM-dev] using Osmium to filter osh files
 
 definitely.
 
 I'm looking to analyze the development of OpenStreetMap in the US with
 a particular focus on contributor activity -- and trying to understand
 differences in the number of contributors entering different
 regions over time and also the *nature* of their contribution activity
 (adding new streets, amenities, natural features vs. adding tags,
 fixing existing features etc).
 
 As a first pass, I've already used the changeset files (using the
 middle of the BBOX as an approximate measure of location) to
 understand user contribution activity. What the changeset files do not
 allow me to do is understand the *nature* of the contributions. In
 order to do this, I'm looking at the history files.
 
 As a first pass, I would like to build a flatfile (CSV-like) that is
 edit-level (rather than changeset level) and understand what each
 edit meant. In particular, I'm interested in classifying edits into
 the following categories
 
 (a) adding new amenity (record name, location of amenity)
 (b) adding new street (record name of street and approximate location,
 i.e. midpoint of the way)
 (c) adding tags to existing street (which tags? maxspeed and oneway
 are interesting)
 (d) deleting features
 (e) other (notably adding natural features etc)
 
 So, specifically, one idea might be to have a dataset that records
 every node added, its location, metadata (user, timestamp etc) and its
 tags and for every way, reduce it to a point (like osmconvert's
 all-to-node) and do the same. I'm also open to other suggestions.
 
 The algorithm might work as follows:
 
 1. go through every node in the osh file and write it to a csv only if
 it does not belong to a way (this will capture point features)

A node can be part of a way and a point feature at the same time.

 2. go through every way, reduce the way to a single point, write the
 point feature and related metadata to a csv file
 3. ignore relations.
 
 So this would be something like osmconvert with the options
 all-to-nodes and drop-relations
 
 Any ideas on how I should go about doing this? In terms of the
 documentation, I've been using the new osmium and looking at
 osmcode.org, but my sense is that this documentation is not yet
 complete (for example I cannot find the tag filter classes that you
 mention) -- are these documented on the Wiki?
 
 It's fun to be using a low-level tool like Osmium, but any help would
 make this process a lot easier for me. Thanks!
 
 Abhishek
 
 On Thu, May 22, 2014 at 8:40 AM, Jochen Topf joc...@remote.org wrote:
  On Wed, May 21, 2014 at 11:20:18 -0400, Abhishek wrote:
  I would like to use osmium to filter .osh files. Specifically, I wanted
  to recreate the features of osmfilter, which allows me to extract certain
  features like amenity=* or highway=* along with their relevant
  histories from a .osh.pbf file.
 
  I've managed to successfully set up osmium and osmium-tool, but I
  couldn't figure out a way to use these tools to filter features from
  the history data. I'm very new to writing code in C++, so I was hoping
  this feature was already implemented. Any ideas on where I should be
  looking for help?
 
  Working with the history files is not easy and it very much depends on what
  you really want to do. In the general case, it is not enough to find, for
  instance, all ways tagged with highway=*, you have to find the nodes that
  were used by those ways at the time when those ways were current. If you are
  only interested in the tags and their history and not the location of those
  ways, it becomes much easier. So first, you have to understand the details
  of the OSM data model and how it plays out in the history files.
 
  Osmium has many building blocks that you will need, it can read the history
  files, there are tag filter classes (osmium::tags::KeyFilter and
 

Re: [OSM-dev] osm2pgsql planet_osm_ways sudden shrinking :(

2014-05-23 Thread Simon Poole
Doesn't seem to be a general issue

http://munin.openstreetmap.org/openstreetmap/yevaud.openstreetmap/postgres_size_gis_9_1_main.html

my server

http://hz3.poole.ch/munin/localdomain/localhost.localdomain/postgres_size_gis.html

Simon


On 23.05.2014 10:47, Christian Quest wrote:
 On 2 different OSMFR tile servers we recently got the same problem:
 planet_osm_ways has suddenly shrunk. We have no idea what could
 cause this.
 It looks like the whole table is truncated.
 
 You can have a look, for example, here:
 http://munin.openstreetmap.fr/free.org/osm13.openstreetmap.fr/postgres_size_ALL.html
 http://munin.openstreetmap.fr/osm12.free.org/osm105.openstreetmap.fr/postgres_size_ALL.html
 
 Are we the only ones facing this strange problem?
 
 -- 
 Christian Quest - OpenStreetMap France
 
 
 ___
 dev mailing list
 dev@openstreetmap.org
 https://lists.openstreetmap.org/listinfo/dev
 





[OSM-dev] OSM software repositories -- git and svn

2014-05-23 Thread Frederik Ramm
Hi,

   this is a discussion/brainstorming about if and how to get rid of
what remains of the OSM project SVN.

Let me first say what I liked about SVN:

* I liked that there was a project-wide SVN without any privilege
hierarchies. Everyone could ask for an account, and then they could
commit changes to everything.

* I liked that there was one canonical version of everything in the SVN,
and that I could change the canonical version instantly, instead of
making my own copy of it and then asking someone to somehow accept my
change.

* I liked that I could "disown" stuff into SVN ("here's a small script I
wrote, I don't really intend to maintain it, but feel free to use/improve
it") and that others could then contribute to something like that without
actually having to take ownership, and without users having to track whose
version was currently the one to use.

* I liked that things were discoverable - if someone made an obscure
script that deals with .poly files then it was very likely that it would
be found in the appropriate SVN directory, rather than under a fancy
repo name somewhere in github.

* I liked that it was ours, and not dependent on the goodwill or
cooperation of a third party operator.

What I picture here is a sometimes precarious version of collective
ownership. It is not suitable for large, highly visible projects with a
solid contributor base - for example, JOSM has always been developed
outside of the project SVN even while that was still in vogue.

One of the main criticisms of our SVN is that it is full of cruft that
doesn't even work anymore, and while there are plenty of github repos that
share the same fate, this laissez-faire attitude to project ownership
might well contribute to that. I would be more likely to clean something
up if it was a repository listed under my name on github than something
that I dumped into SVN five years ago and have since forgotten.

Still I maintain that there is, conceptually, a niche for this shared
ownership, where the developer community of the whole project has
instant write access to the canonical version of things.

Myself, I've written countless little scripts and snippets that are
shared in SVN. Sometimes people have indeed made their own changes to
them; sometimes my scripts simply live alongside 5 other scripts that
are thematically related (e.g. a collection of utilities that all deal
with .poly files in one way or the other).

The generic approach to something like that today would probably be that
every author creates a github repo for his stuff, potentially granting
others access if they ask. Git and github are simply the way to go these
days, and while some people still shed tears for the good old SVN (or
CVS, or... what was it before that, RCS?) times, it doesn't really make
much sense to have many different technologies for what is essentially
the same basic task of version control.

Assuming for a moment that we would like to drop our SVN altogether and
replace it by git/github - I'd like to discuss the following:

1. collective ownership

Can we maybe have the superior (for some value of "superior") technology
offered by git and still not completely drop the collective ownership idea?
Can we somehow
use git(hub) (without totally ab-using it) to create a niche for stuff
that is not quite a standalone project and not necessarily owned by
one single person? Or am I maybe the only person in the world who sees
some value in that concept?

Should everyone who remotely considers themselves an OSM developer have
write access to the openstreetmap repository in github, or should we
create an openstreetmap-developer repository for that which would have
a less official character? Is there maybe a technical way to grant
write access to the repository to everyone who has an OSM account
without extra signup?

Can we continue to support users who, for privacy reasons, don't want to
work through the github platform but who would rather only communicate
with a server directly under our control?

2. discoverability

I know that there are people who create a github repository for
everything, even if it's just a 30-line text file to maintain. Doing
that for all the conceptually different things that we now have in our
SVN would probably yield something between 200 and 500 repositories,
and it would be (correct me if I am wrong please) a big step backwards
in discoverability because git(hub) repositories cannot be arranged in
trees - in SVN I can, for example, go to applications/rendering or
applications/utils/export and do an ls there to see.

Would that mean that we'd essentially have to create one big repository
that can hold a ton of completely separate stuff like our SVN does
today? Or would we create hundreds of mini repos and then have a
separate index for them, e.g. a wiki page or a 101st repo? Maybe there's
some state of the art solution for this kind of problem?

3. moving across old stuff from SVN

If we can manage to find a way to give a new home to stuff from SVN in
the 

Re: [OSM-dev] OSM software repositories -- git and svn

2014-05-23 Thread Paul Hartmann

Hi,

I'd just like to point out that there is already a dedicated github account:

https://github.com/openstreetmap

It hosts for example iD, osm2pgsql, mod_tile and lots of mirrors.

Technically, openstreetmap is a github Organization. This means that the
account is owned not by a single person but by a group of so-called Owners
(including tomhughes and Firefishy). You also get a very convenient
interface for assigning write and admin access to members, for individual
repositories and globally.


See also: 
https://github.com/openstreetmap/openstreetmap-mirror/blob/master/ABOUT.md


On 23.05.2014 15:19, Frederik Ramm wrote:

 Myself, I've written countless little scripts and snippets that are
 shared in SVN. Sometimes people have indeed made their own changes to
 them; sometimes my scripts simply live alongside 5 other scripts that
 are thematically related (e.g. a collection of utilities that all deal
 with .poly files in one way or the other).
 [...]
 I know that there are people who create a github repository for
 everything, even if it's just a 30-line text file to maintain. Doing
 that for all the conceptually different things that we now have in our
 SVN would probably yield something between 200 and 500 repositories


I would guess that it is more typical that people think in terms of 
named projects. So in most cases, a separate repository would be 
appropriate.



 Or would we create hundreds of mini repos and then have a
 separate index for them, e.g. a wiki page or a 101st repo?


That might be the best option.

Paul




Re: [OSM-dev] OSM software repositories -- git and svn

2014-05-23 Thread Frederik Ramm
Hi,

On 05/23/2014 07:31 PM, Paul Hartmann wrote:
 I just like to point out, that there is already a dedicated github account:
 https://github.com/openstreetmap
 It hosts for example iD, osm2pgsql, mod_tile and lots of mirrors.

I was aware of that but I'm not sure if people would be happy for a
couple hundred people (or even any OSM contributor) to have write
access to the lot, and if swamping that with ...

 Or would we create hundreds of mini repos and then have a
 separate index for them, e.g. a wiki page or a 101st repo?
 
 That might be the best option.

... 100s of repos would be good. There's also the osmlab group account
on github which might be a bit more of a free-for-all than the
openstreetmap account (README says we are liberal with commit
rights). Possibly that one was created in order to not have to be too
liberal on the openstreetmap account ;)

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09 E008°23'33



Re: [OSM-dev] OSM software repositories -- git and svn

2014-05-23 Thread Jochen Topf
On Fri, May 23, 2014 at 03:19:07 +0200, Frederik Ramm wrote:
 Should everyone who remotely considers themselves an OSM developer have
 write access to the openstreetmap repository in github, or should we
 create an openstreetmap-developer repository for that which would have
 a less official character? Is there maybe a technical way to grant
 write access to the repository to everyone who has an OSM account
 without extra signup?

GitHub has an API. It should be pretty easy to create a mini web page
somewhere, where everybody can just type in their GitHub account name and the
web page will automatically add this account to the committer list for some
repository or so. You could combine this with an OAuth authentication to the
OSM server if you wanted.

An alternative would be to add some hooks so that all pull requests to some
repository are always merged automatically. People would still clone the
project and work on their own copy, but they could always send a pull request
to the master repository, which would merge it automatically. This would
even work without github.

You could also set up some fully automated system where everybody can send
pull requests: if you are on the whitelist, your request is accepted
immediately; if you are on the blacklist, it is rejected; and everybody else
lands on some list that people can review. If a reviewer clicks on okay, your
pull request goes through and you are added to the whitelist automatically.
That way there is a minimal hurdle, but once you have done something okay,
the system will trust you in the future. Reviewers could be the same people
that are on the whitelist: you seed it with a few trusted people and then
the system will regulate itself (hopefully).
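
The decision logic of such a system is small. Here is a minimal sketch in
plain Python, assuming the lists are kept in memory and that merging itself
happens elsewhere (for example in a git hook). The class and method names are
hypothetical and no git or GitHub API is involved.

```python
class PullRequestGate:
    """Decides what happens to a pull request based on its author's status."""

    def __init__(self, trusted):
        self.whitelist = set(trusted)  # seeded with a few trusted people
        self.blacklist = set()
        self.review_queue = []         # (author, pr_id) pairs awaiting review

    def submit(self, author, pr_id):
        if author in self.blacklist:
            return "rejected"
        if author in self.whitelist:
            return "merged"
        # Unknown authors wait until somebody on the whitelist reviews them.
        self.review_queue.append((author, pr_id))
        return "pending"

    def approve(self, reviewer, pr_id):
        """A whitelisted reviewer okays a pending request; its author becomes trusted."""
        if reviewer not in self.whitelist:
            raise PermissionError(f"{reviewer} is not allowed to review")
        for entry in self.review_queue:
            author, queued_id = entry
            if queued_id == pr_id:
                self.review_queue.remove(entry)
                self.whitelist.add(author)
                return "merged"
        raise KeyError(pr_id)


gate = PullRequestGate(trusted={"alice"})
first = gate.submit("bob", 1)        # bob is unknown, so he has to wait
reviewed = gate.approve("alice", 1)  # alice okays it, bob is now trusted
second = gate.submit("bob", 2)       # goes through immediately this time
```

Because approved authors join the whitelist, they can also review others,
which is what lets the system grow from a small seed.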

Of course there could be lots of other combinations. You can add code to make
sure tests run through before merging etc. The coding needed for these things
should be pretty minimal, just some scripts that are run from git or github
hooks.

The added benefit of doing that in git is that you don't need special accounts
like in SVN, because git basically creates accounts out of email addresses.
Well, if you use GitHub in there, people need github accounts. But you don't
have to have another account list that somebody has to administer like with
the current SVN setup. So no password changing hassles etc., that's all done
somewhere else.

Jochen
-- 
Jochen Topf  joc...@remote.org  http://www.jochentopf.com/  +49-721-388298



Re: [OSM-dev] OSM software repositories -- git and svn

2014-05-23 Thread Jochen Topf
On Fri, May 23, 2014 at 03:19:07PM +0200, Frederik Ramm wrote:
 2. discoverability
 
 I know that there are people who create a github repository for
 everything, even if it's just a 30-line text file to maintain. Doing
 that for all the conceptually different things that we now have in our
 SVN would probably yield something between 200 and 500 repositories,
 and it would be (correct me if I am wrong please) a big step backwards
 in discoverability because git(hub) repositories cannot be arranged in
 trees - in SVN I can, for example, go to applications/rendering or
 applications/utils/export and do an ls there to see.
 
 Would that mean that we'd essentially have to create one big repository
 that can hold a ton of completely separate stuff like our SVN does
 today? Or would we create hundreds of mini repos and then have a
 separate index for them, e.g. a wiki page or a 101st repo? Maybe there's
 some state of the art solution for this kind of problem?

Lots of little repositories and a 101st repo with a list of repo URLs
sounds good to me. This would also allow different ownership/rights for
each of the little repos. Why a one-size-fits-all solution when some repos
could be free-for-all and some more managed? Just allow everybody to add
any repository to your 101st list repo.

If you want to, you could ask people to add some standardized meta.json
file or so that you can then crawl to build some kind of index to make
it even easier to find repos by keyword or whatever.
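
As a rough sketch of what such an index could look like, here is a small
Python example. The meta.json layout (a "url" and a list of "keywords") is
purely an assumption, since no format has been agreed on, and the crawling
step is left out: the function simply takes the file contents as strings.
The example.org URLs are placeholders.

```python
import json
from collections import defaultdict


def build_index(meta_documents):
    """Map each lower-cased keyword to the sorted list of repo URLs that declare it."""
    index = defaultdict(set)
    for raw in meta_documents:
        meta = json.loads(raw)
        for keyword in meta.get("keywords", []):
            index[keyword.lower()].add(meta["url"])
    return {keyword: sorted(urls) for keyword, urls in index.items()}


# Contents of three hypothetical meta.json files.
meta_documents = [
    '{"url": "https://example.org/poly-tools.git", "keywords": ["poly", "export"]}',
    '{"url": "https://example.org/poly-merge.git", "keywords": ["Poly"]}',
    '{"url": "https://example.org/tile-render.git", "keywords": ["rendering"]}',
]

index = build_index(meta_documents)
poly_repos = index["poly"]
```

A lookup by keyword then returns every repo that lists it, regardless of who
owns the repo or where it is hosted.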

Jochen
-- 
Jochen Topf  joc...@remote.org  http://www.jochentopf.com/  +49-721-388298
