Re: [GRASS-user] too many categories: buffer overflow
On 2019-06-20 at 09:37 +02, Moritz Lennert wrote...
> Just a rapid, wild guess here, but would it be feasible to just
> vectorize the three separately and then create the link afterwards
> using v.distance ?

This may work. An alternative is to do this through the Python GRASS interface, where I can more easily create unique IDs that are not based on the simple "Cell #" algorithm I'm currently using. That is harder to do in bash. Or I could recompile GRASS to use LONGs rather than INTs for the primary key.

For now I'm sticking with a lower-resolution 90 m raster. Problem solved.

Thanks,

-k.
___
grass-user mailing list
grass-user@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-user
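A minimal sketch of the "unique IDs not based on the cell #" idea, in plain Python. It does not use the GRASS Python API: nested lists stand in for the `dir` and `streams` rasters, `None` stands in for null cells, and the function name `assign_outlet_ids` is hypothetical. The point is only that numbering the stream-less outlets sequentially keeps the IDs bounded by the number of outlets rather than by the number of cells.

```python
def assign_outlet_ids(direction, streams):
    """Give every outlet cell (direction < 0) an ID.

    Outlets that lie on a stream reuse the stream category; all other
    outlets get sequential IDs starting just above the largest stream
    category, so the two sets of IDs cannot collide.
    """
    stream_cats = [v for row in streams for v in row if v is not None]
    next_id = max(stream_cats, default=0) + 1

    outlets = []
    for dir_row, stream_row in zip(direction, streams):
        out_row = []
        for d, s in zip(dir_row, stream_row):
            if d is None or d >= 0:
                out_row.append(None)      # not an outlet
            elif s is not None:
                out_row.append(s)         # outlet on a stream: keep its category
            else:
                out_row.append(next_id)   # outlet without a stream: next free ID
                next_id += 1
        outlets.append(out_row)
    return outlets


# Toy 3x3 example (made-up values, not from the Greenland data).
direction = [[1, -1, 2], [-3, 4, -5], [6, 7, -8]]
streams = [[None, 7, None], [None, None, 3], [None, None, None]]
outlet_ids = assign_outlet_ids(direction, streams)
print(outlet_ids)  # [[None, 7, None], [8, None, 3], [None, None, 9]]
```

With this scheme the largest ID is roughly (largest stream category + number of stream-less outlets), which should stay far below the 32-bit limit even on a 4.5 billion cell grid, as long as outlets are a small fraction of the cells.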
Re: [GRASS-user] too many categories: buffer overflow
On 18/06/19 21:51, Ken Mankoff wrote:
> [...]
> Any advice how to generate streams, outlets, and basins all with
> linked primary key would be much appreciated.

Just a rapid, wild guess here, but would it be feasible to just vectorize the three separately and then create the link afterwards using v.distance?

Moritz
Re: [GRASS-user] too many categories: buffer overflow
Hi Micha and Markus,

On 2019-06-18 at 10:07 -04, Micha Silver wrote...
> Do you really want a vector polygon map with > 2 billion features?

No, and there are not that many.

% r.info -r basins
min=-2147474681
max=2147429730

But I don't have categories from 1 to 2147429730. The values are sparse. I describe my workflow, and why I've created these sparse values, in more detail below.

Even though there are << 2 billion, there should be many basins. This is all of Greenland at 30 m resolution, which is 4.5 billion cells.

Taking a step back, I'm trying to generate unique basin values that match the stream and outlet CAT values. Here is my workflow, which doesn't appear to have any problems when run at 90x90 m resolution (400 million cells) but fails at 30x30 m resolution (about 10x as many, or 4.5 billion cells).

1) Find streams:

r.stream.extract elevation=head threshold=${THRESH} memory=16384 direction=dir stream_raster=streams stream_vector=streams

2) Find outlets. Where streams have outlets, use the same CAT value so the two can be linked in further analysis. But many outlets don't have streams. These need to have unique categories for the next step, when we find basins. This is where my error is: I set the unique value to the cell #, which is > 2 billion when using a 30x30 m domain.

r.mapcalc "outlets_all = if(dir < 0, 1, null())"
r.mapcalc "outlets_streams_1 = if((dir < 0) && (not(isnull(streams))), streams, outlets_all)"
### BUG INTRODUCED HERE, setting (eventual) cat to cell number:
r.mapcalc "outlets_streams = if(outlets_streams_1 != 1, outlets_streams_1, max(outlets_streams_1)+1+col()+(max(col())*(row()-1)))"

# convert outlets to a vector.
r.out.xyz input=outlets_streams | \
  v.in.ascii input=- output=outlets_streams separator=pipe \
  columns="x int, y int, cat int" x=1 y=2 cat=3

Q: How can I create the outlets_streams vector for all locations where dir < 0 (all outlets), so that it keeps the same value as the streams raster where that raster is defined, but has unique values at all other locations where dir < 0 and streams is not defined?

3) Find basins:

r.stream.basins -m direction=dir points=outlets_streams basins=basins_all memory=16384 --verbose

4) Absorb small basins:

r.clump -d input=basins_all output=basins_nosmall minsize=124
r.mode base=basins_nosmall cover=basins_all output=basins

### BUG APPEARS HERE
r.to.vect -v input=basins output=basins type=area

# drop outlets for absorbed basins.
r.mapcalc "outlets = if(outlets_streams == basins, basins, null())"
r.to.vect -v input=outlets output=outlets type=point

NOTE: I use r.mode instead of r.area because I need to maintain the category value, so that the eventual vectors can have linked primary keys. r.area re-assigns categories.

Any advice on how to generate streams, outlets, and basins all with a linked primary key would be much appreciated.

Thanks,

-k.
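The arithmetic behind the cell-number formula in step 2 can be checked with a short sketch. The grid dimensions below are hypothetical, chosen only so that the cell count comes to 4.5 billion; the real Greenland region dimensions are not given in this thread. The constant offset `max(outlets_streams_1)+1` is left out because it does not change the conclusion. The second part rests on an assumption worth verifying against the r.mapcalc manual: that `max()` compares its arguments cell by cell rather than scanning the whole map, in which case `max(col())` is just `col()` and `ncols()` would be the function to use for the grid width.

```python
INT32_MAX = 2**31 - 1  # largest value a 32-bit signed integer can hold


def cell_number(row, col, ncols):
    """1-based row-major cell number, as the formula in step 2 intends."""
    return col + ncols * (row - 1)


# Hypothetical dimensions: 90,000 rows x 50,000 columns = 4.5 billion cells.
NROWS, NCOLS = 90_000, 50_000

largest = cell_number(NROWS, NCOLS, NCOLS)
print(largest)              # 4500000000
print(largest > INT32_MAX)  # True: does not fit in a 32-bit category


def col_times_row(row, col):
    """What the formula reduces to if max(col()) equals col() per cell."""
    return col + col * (row - 1)  # = col * row


# Two different cells then receive the same ID.
print(col_times_row(2, 3) == col_times_row(3, 2))  # True
```

So even before any overflow, a per-cell `max(col())` would produce repeated IDs, and with the intended grid width the IDs exceed the 32-bit range at 30 m resolution.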
Re: [GRASS-user] too many categories: buffer overflow
On Tue, Jun 18, 2019 at 4:08 PM Micha Silver wrote:
>
> It looks like you've run out of integer values for the "category" primary key.
>
> Do you really want a vector polygon map with > 2 billion features?
>
> On 18/06/2019 15:14, Ken Mankoff wrote:
> > Hello,
> >
> > I think I'm experiencing a buffer overflow. This is a hard one to search for with GRASS GIS because the words "buffer" and "overflow" appear throughout, as in r.buffer and overflowing weirs, etc., but I'm referring to the C-code error type of buffer overflow:
> >
> > r.to.vect -v input=basins output=basins type=area
> >
> > DBMI-SQLite driver error:
> > Error in sqlite3_step():
> > UNIQUE constraint failed: basins.cat
> >
> > ERROR: Unable to insert into table: insert into basins values (
> > -2137121269, '(Category -2137121269)')

This is close to the 32-bit signed integer limit, but not yet there; the lower limit for raster maps of type CELL is −2,147,483,648.

The error "UNIQUE constraint failed: basins.cat" indicates that the given category already exists.

Another issue is that negative categories are not allowed for vectors.

Yet another issue is that basin numbers should all be positive, indicating that integer overflow occurred when creating the raster map basins, which is probably the root of this problem.

What is the range of values in the raster map basins (r.info -r basins), and how was it created?

Markus M
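To illustrate the integer overflow described above, here is a small sketch of 32-bit two's-complement wraparound. It assumes that values too large for a 32-bit signed integer wrap around, which is common behaviour but not something this thread confirms for the GRASS modules involved; `wrap_int32` is a hypothetical helper, not a GRASS function.

```python
def wrap_int32(value):
    """Reduce an integer to the 32-bit signed range (two's complement)."""
    return (value + 2**31) % 2**32 - 2**31


print(wrap_int32(2_147_483_647))  # 2147483647  (the upper limit still fits)
print(wrap_int32(2_147_483_648))  # -2147483648 (one more wraps to the lower limit)

# The value that would wrap to the category in the error message:
print(wrap_int32(2_157_846_027))  # -2137121269

# Beyond 2**32 the wrapped values re-enter the positive range, so they
# can coincide with IDs that were assigned legitimately.
print(wrap_int32(4_500_000_000))  # 205032704
```

Under that assumption, both symptoms follow from the same cause: IDs above 2,147,483,647 show up as negative categories, and distinct large IDs can land on the same 32-bit value, which is consistent with the "UNIQUE constraint failed" error.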
Re: [GRASS-user] too many categories: buffer overflow
It looks like you've run out of integer values for the "category" primary key.

Do you really want a vector polygon map with > 2 billion features?

On 18/06/2019 15:14, Ken Mankoff wrote:
> [...]
> Does anyone have any suggestion how I can convert a raster with MANY categories to a vector?

--
Micha Silver
Ben Gurion Univ.
Sde Boker, Remote Sensing Lab
[GRASS-user] too many categories: buffer overflow
Hello,

I think I'm experiencing a buffer overflow. This is a hard one to search for with GRASS GIS because the words "buffer" and "overflow" appear throughout, as in r.buffer and overflowing weirs, etc., but I'm referring to the C-code error type of buffer overflow:

r.to.vect -v input=basins output=basins type=area

DBMI-SQLite driver error:
Error in sqlite3_step():
UNIQUE constraint failed: basins.cat

ERROR: Unable to insert into table: insert into basins values (
-2137121269, '(Category -2137121269)')

Does anyone have any suggestions for how I can convert a raster with MANY categories to a vector?

Thanks,

-k.