Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-20 Thread Dean Rasheed
On 18 February 2016 at 10:05, Dean Rasheed  wrote:
> OK, I'll add a check for that.
> Thanks for testing.
>

Pushed, with a catversion bump.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-18 Thread Dean Rasheed
On 18 February 2016 at 02:00, Vitaly Burovoy  wrote:
>> + else
>> + have_digits = false;
> Does it worth to move it to the declaration and remove that "else" block?
> +   boolhave_digits = false;

OK, that's probably a bit neater.


> Is it important to use:
>> +  ObjectIdGetDatum(InvalidOid),
>> +  Int32GetDatum(-1)));
> instead of "0, -1"? Previously I left immediate constants because I
> found the same thing in jsonb.c (rows 363, 774)...

I think it's preferable to use the macros because they make it clearer
what datatypes are being passed around. I think that's the more common
style, but the JSONB code is not technically wrong either.


>> +  if (pg_strcasecmp(strptr, "bytes") == 0)
>> +  else if
>> +  else if
>> +  else if
> It seems little ugly for me. In that case (not using values from GUC)
> I'd create a static array for it and do a small cycle via it. Would it
> better?

That's a matter of personal preference. I prefer the values inline
because then the values are closer to where they're being used, which
for me makes it easier to follow (see e.g., convert_priv_string()).

In guc.c they're in lookup tables because they're referred to from
multiple places, and because it needs to switch between tables based
on context, neither of which is true here. If there is a need to
re-use these values elsewhere in the future it can be refactored, but
right now that feels like an over-engineered solution for this
problem.


>> + NumericGetDatum(mul_num),
> Two more space for indentation.

pgindent pulled that back.


> Also I've found that with allowing exponent there is a weird result
> (we tried to avoid "type numeric" in the error message):
> SELECT 
> pg_size_bytes('123e1000
> kB');
> ERROR:  invalid input syntax for type numeric:
> "123e1000"

OK, I'll add a check for that.
Thanks for testing.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-17 Thread Vitaly Burovoy
On 2/17/16, Dean Rasheed  wrote:
> On 17 February 2016 at 00:39, Vitaly Burovoy 
> wrote:
>> Now parse_memory_unit returns -1024 for bytes as divider, constant
>> "bytes" has moved there.
>> Add new memory_units_bytes_hint which differs from an original
>> memory_units_int by "bytes" size unit.
>> Allow hintmsg be NULL and if so, skip setting dereferenced variable to
>> memory_units_bytes_hint.
>>
>
> I think that approach is getting more and more unwieldy, and it simply
> isn't worth the effort just to share a few values from the unit
> conversion table, especially given that the set of supported units
> differs between the two places.
>
>> On 2/16/16, Dean Rasheed  wrote:
>>> ISTM that it would be far less code, and much simpler and more
>>> readable to just parse the supported units directly in
>>> pg_size_bytes(), rather than trying to share code with guc.c, when the
>>> supported units are actually different and may well diverge further in
>>> the future.
>
> I've gone with this approach and it is indeed far less code, and much
> simpler and easier to read. This will also make it easier to
> maintain/extend in the future.
>
> I've made a few minor tweaks to the docs, and added a note to make it
> clear that the units in these functions work in powers of 2 not 10.
>
> I also took the opportunity to tidy up the number scanning code
> somewhat (I was tempted to rip it out entirely, since it feels awfully
> close to duplicating the numeric code, but it's probably worth it for
> the better error message).

Yes, the better error message was the only reason to leave that scan.

> Additionally, note that I replaced strcasecmp() with pg_strcasecmp(),
> since AIUI the former is not available on all supported platforms.

Thank you. I didn't know that function exists.

> Barring objections, and subject to some more testing, I intend to
> commit this version.
>
> Regards,
> Dean

Than you for your work. Your version seems even better that refactored by me.
But I have several questions (below).

> + boolhave_digits;
Agree, "main_digits" and "error" together was redundant, "have_digits"
is enough.


> + else
> + have_digits = false;
Does it worth to move it to the declaration and remove that "else" block?
+   boolhave_digits = false;


Is it important to use:
> +  ObjectIdGetDatum(InvalidOid),
> +  Int32GetDatum(-1)));
instead of "0, -1"? Previously I left immediate constants because I
found the same thing in jsonb.c (rows 363, 774)...


> +  if (pg_strcasecmp(strptr, "bytes") == 0)
> +  else if
> +  else if
> +  else if
It seems little ugly for me. In that case (not using values from GUC)
I'd create a static array for it and do a small cycle via it. Would it
better?


> + errmsg("invalid size: \"%s\"", text_to_cstring(arg)),
Agree. Usage of "text_to_cstring" again is a good idea for improving
readability.


> + NumericGetDatum(mul_num),
Two more space for indentation.


Also I've found that with allowing exponent there is a weird result
(we tried to avoid "type numeric" in the error message):
SELECT 
pg_size_bytes('123e1000
kB');
ERROR:  invalid input syntax for type numeric:
"123e1000"


[1]http://www.postgresql.org/docs/9.5/static/error-style-guide.html#AEN110702
-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-17 Thread Dean Rasheed
On 17 February 2016 at 00:39, Vitaly Burovoy  wrote:
> On 2/16/16, Vitaly Burovoy  wrote:
>> On 2/16/16, Dean Rasheed  wrote:
>>> Fixing that in parse_memory_unit() would be messy because it assumes a
>>> base unit of kB, so it would require a negative multiplier, and
>>> pg_size_bytes() would have to be taught to divide by the magnitude of
>>> negative multipliers in the same way that guc.c does.
>
> Now parse_memory_unit returns -1024 for bytes as divider, constant
> "bytes" has moved there.
> Add new memory_units_bytes_hint which differs from an original
> memory_units_int by "bytes" size unit.
> Allow hintmsg be NULL and if so, skip setting dereferenced variable to
> memory_units_bytes_hint.
>

I think that approach is getting more and more unwieldy, and it simply
isn't worth the effort just to share a few values from the unit
conversion table, especially given that the set of supported units
differs between the two places.


>>> ISTM that it would be far less code, and much simpler and more
>>> readable to just parse the supported units directly in
>>> pg_size_bytes(), rather than trying to share code with guc.c, when the
>>> supported units are actually different and may well diverge further in
>>> the future.
>>

I've gone with this approach and it is indeed far less code, and much
simpler and easier to read. This will also make it easier to
maintain/extend in the future.

I've made a few minor tweaks to the docs, and added a note to make it
clear that the units in these functions work in powers of 2 not 10.

I also took the opportunity to tidy up the number scanning code
somewhat (I was tempted to rip it out entirely, since it feels awfully
close to duplicating the numeric code, but it's probably worth it for
the better error message).

Additionally, note that I replaced strcasecmp() with pg_strcasecmp(),
since AIUI the former is not available on all supported platforms.

Barring objections, and subject to some more testing, I intend to
commit this version.

Regards,
Dean
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index f9eea76..60f117a
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17767,6 +17767,9 @@ postgres=# SELECT * FROM pg_xlogfile_nam
 pg_relation_size


+pg_size_bytes
+   
+   
 pg_size_pretty


@@ -17838,6 +17841,15 @@ postgres=# SELECT * FROM pg_xlogfile_nam
   
   

+pg_size_bytes(text)
+
+   bigint
+   
+ Converts a size in human-readable format with size units into bytes
+   
+  
+  
+   
 pg_size_pretty(bigint)
 
text
@@ -17968,11 +17980,27 @@ postgres=# SELECT * FROM pg_xlogfile_nam
 

 pg_size_pretty can be used to format the result of one of
-the other functions in a human-readable way, using kB, MB, GB or TB as
-appropriate.
+the other functions in a human-readable way, using bytes, kB, MB, GB or TB
+as appropriate.

 

+pg_size_bytes can be used to get the size in bytes from a
+string in human-readable format. The input may have units of bytes, kB,
+MB, GB or TB, and is parsed case-insensitively. If no units are specified,
+bytes are assumed.
+   
+
+   
+
+ The units kB, MB, GB and TB used by the functions
+ pg_size_pretty and pg_size_bytes are defined
+ using powers of 2 rather than powers of 10, so 1kB is 1024 bytes, 1MB is
+ 10242 = 1048576 bytes, and so on.
+
+   
+
+   
 The functions above that operate on tables or indexes accept a
 regclass argument, which is simply the OID of the table or index
 in the pg_class system catalog.  You do not have to look up
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
new file mode 100644
index 2084692..91260cd
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -700,6 +700,145 @@ pg_size_pretty_numeric(PG_FUNCTION_ARGS)
 }
 
 /*
+ * Convert a human-readable size to a size in bytes
+ */
+Datum
+pg_size_bytes(PG_FUNCTION_ARGS)
+{
+	text	   *arg = PG_GETARG_TEXT_PP(0);
+	char	   *str,
+			   *strptr,
+			   *endptr;
+	bool		have_digits;
+	char		saved_char;
+	Numeric		num;
+	int64		result;
+
+	str = text_to_cstring(arg);
+
+	/* Skip leading whitespace */
+	strptr = str;
+	while (isspace((unsigned char) *strptr))
+		strptr++;
+
+	/* Check that we have a valid number and determine where it ends */
+	endptr = strptr;
+
+	/* Part (1): sign */
+	if (*endptr == '-' || *endptr == '+')
+		endptr++;
+
+	/* Part (2): main digit string */
+	if (isdigit((unsigned char) *endptr))
+	{
+		have_digits = true;
+		do
+			endptr++;
+		while (isdigit((unsigned char) *endptr));
+	}
+	else
+		have_digits = false;
+
+	/* Part (3): optional decimal point and fractional digits */
+	if (*endptr == '.')
+	{
+		endptr++;
+		if (isdigit((unsigned char) *endptr))
+		{
+			have_digits = 

Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-16 Thread Vitaly Burovoy
On 2/16/16, Vitaly Burovoy  wrote:
> On 2/16/16, Dean Rasheed  wrote:
>> Fixing that in parse_memory_unit() would be messy because it assumes a
>> base unit of kB, so it would require a negative multiplier, and
>> pg_size_bytes() would have to be taught to divide by the magnitude of
>> negative multipliers in the same way that guc.c does.

Now parse_memory_unit returns -1024 for bytes as divider, constant
"bytes" has moved there.
Add new memory_units_bytes_hint which differs from an original
memory_units_int by "bytes" size unit.
Allow hintmsg be NULL and if so, skip setting dereferenced variable to
memory_units_bytes_hint.

>> ISTM that it would be far less code, and much simpler and more
>> readable to just parse the supported units directly in
>> pg_size_bytes(), rather than trying to share code with guc.c, when the
>> supported units are actually different and may well diverge further in
>> the future.
>
> Initially it was not shared with guc.c and now it is by the letter[1]
> of Oleksandr Shulgin and Robert Haas.
>
>> I'll try to hack something up, and see what it looks like.
>>
>> Regards,
>> Dean

Current version contains correct hint message, it passes tests.
Besides mentioned above there are some comment rewordings, and
variable renaming.

[1]http://www.postgresql.org/message-id/ca+tgmoanot7ugjsbibfuqgachk2lzrypmkjvvugp5r197yu...@mail.gmail.com

-- 
Best regards,
Vitaly Burovoy


pg-size-bytes-11chgd2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-16 Thread Vitaly Burovoy
On 2/16/16, Vitaly Burovoy  wrote:
> On 2/16/16, Dean Rasheed  wrote:
>> On 16 February 2016 at 05:01, Pavel Stehule 
>> wrote:
>>> 2016-02-15 10:16 GMT+01:00 Dean Rasheed :
>> Fixing that in parse_memory_unit() would be messy because it assumes a
>> base unit of kB, so it would require a negative multiplier, and
>> pg_size_bytes() would have to be taught to divide by the magnitude of
>> negative multipliers in the same way that guc.c does.

Well... pg_size_bytes already knows what negative multiplier is (see
the constant skip_unit), but it seems it is hard to teach other
functions expecting values in KB to understand negative multipliers as
dividers.

> I guess the best way is to simply make "bytes" a valid size unit even
> in GUC by adding it to the memory_unit_conversion_table

Oops. I forgot they are not in bytes.

> with reflecting it in memory_units_hint and removing
> an extra checking from pg_size_bytes.
>
>> ISTM that it would be far less code, and much simpler and more
>> readable to just parse the supported units directly in
>> pg_size_bytes(), rather than trying to share code with guc.c, when the
>> supported units are actually different and may well diverge further in
>> the future.

Initially it was not shared with guc.c and now it is by the letter[1]
of Oleksandr Shulgin and Robert Haas.

I guess since parse_memory_unit uses memory_unit_conversion_table
where values (not for GUC_UNIT_KB but nevertheless) can be negatives
and for KB it can be a single value (-1024), move "bytes" check to
parse_memory_unit, add a constant memory_units_bytes_hint as
copy-paste with included "bytes" size unit (and with a comment it is
special for parse_memory_unit function).

Also change "skip_unit" in dbsize.c to -1024.

>> I'll try to hack something up, and see what it looks like.

[1]http://www.postgresql.org/message-id/ca+tgmoanot7ugjsbibfuqgachk2lzrypmkjvvugp5r197yu...@mail.gmail.com
-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-16 Thread Vitaly Burovoy
On 2/16/16, Dean Rasheed  wrote:
> On 16 February 2016 at 05:01, Pavel Stehule 
> wrote:
>> 2016-02-15 10:16 GMT+01:00 Dean Rasheed :
>>> Is there any reason not to make
>>> pg_size_bytes() return numeric?
>>>
>> This is a question. I have not a strong opinion about it. There are no
>> any
>> technical objection - the result will be +/- same. But you will enforce
>> Numeric into outer expression evaluation.
>>
>> The result will not be used together with function pg_size_pretty, but
>> much
>> more with functions pg_relation_size, pg_relation_size, .. and these
>> functions doesn't return Numeric. These functions returns bigint. Bigint
>> is
>> much more natural type for this purpose.
>>
>> Is there any use case for Numeric?
>>
>
> [Shrug] I don't really have a strong opinion about it either, but it
> seemed that maybe the function might have some wider uses beyond just
> comparing object sizes, and since it's already computing the numeric
> size, it might as well just return it.

I agree with Pavel, it leads to a comparison in numeric, which is
overhead. int8 can always be casted to numeric on-demand, but not vice
versa. The main reasons to return int8 instead of numeric are
performance and inability to imagine use case where so big numbers
could take place.

> Looking at the rest of the patch, I think there are other things that
> need tidying up -- the unit parsing code for one. This is going to
> some effort to reuse the memory_unit_conversion_table from guc.c, but
> the result is that it has added quite a bit of new code and now the
> responsibility for parsing different units is handled by different
> functions in different files, which IMO is quite messy. Worse, the
> error message is wrong and misleading:
>
> select pg_size_bytes('10 bytes'); -- OK
> select pg_size_bytes('10 gallons');
> ERROR:  invalid size: "10 gallons"
> DETAIL:  Invalid size unit "gallons"
> HINT:  Valid units for this parameter are "kB", "MB", "GB", and "TB".
>
> which fails to mention that "bytes" is also a valid unit.

:-D

Yes, it is fail. I missed it...

> Fixing that in parse_memory_unit() would be messy because it assumes a
> base unit of kB, so it would require a negative multiplier, and
> pg_size_bytes() would have to be taught to divide by the magnitude of
> negative multipliers in the same way that guc.c does.

I guess the best way is to simply make "bytes" a valid size unit even
in GUC by adding it to the memory_unit_conversion_table with
reflecting it in memory_units_hint and removing an extra checking from
pg_size_bytes.

> ISTM that it would be far less code, and much simpler and more
> readable to just parse the supported units directly in
> pg_size_bytes(), rather than trying to share code with guc.c, when the
> supported units are actually different and may well diverge further in
> the future.
>
> I'll try to hack something up, and see what it looks like.
>
> Regards,
> Dean

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-16 Thread Pavel Stehule
2016-02-16 11:25 GMT+01:00 Dean Rasheed :

> On 16 February 2016 at 05:01, Pavel Stehule 
> wrote:
> > 2016-02-15 10:16 GMT+01:00 Dean Rasheed :
> >> Is there any reason not to make
> >> pg_size_bytes() return numeric?
> >>
> > This is a question. I have not a strong opinion about it. There are no
> any
> > technical objection - the result will be +/- same. But you will enforce
> > Numeric into outer expression evaluation.
> >
> > The result will not be used together with function pg_size_pretty, but
> much
> > more with functions pg_relation_size, pg_relation_size, .. and these
> > functions doesn't return Numeric. These functions returns bigint. Bigint
> is
> > much more natural type for this purpose.
> >
> > Is there any use case for Numeric?
> >
>
> [Shrug] I don't really have a strong opinion about it either, but it
> seemed that maybe the function might have some wider uses beyond just
> comparing object sizes, and since it's already computing the numeric
> size, it might as well just return it.
>

I see only one disadvantage - when we return numeric, then all following
expression will be in numeric, and for some functions, or some expression
the users will have to cast explicitly to bigint.


>
>
> Looking at the rest of the patch, I think there are other things that
> need tidying up -- the unit parsing code for one. This is going to
> some effort to reuse the memory_unit_conversion_table from guc.c, but
> the result is that it has added quite a bit of new code and now the
> responsibility for parsing different units is handled by different
> functions in different files, which IMO is quite messy. Worse, the
> error message is wrong and misleading:
>
> select pg_size_bytes('10 bytes'); -- OK
> select pg_size_bytes('10 gallons');
> ERROR:  invalid size: "10 gallons"
> DETAIL:  Invalid size unit "gallons"
> HINT:  Valid units for this parameter are "kB", "MB", "GB", and "TB".
>
> which fails to mention that "bytes" is also a valid unit.
>

true


>
> Fixing that in parse_memory_unit() would be messy because it assumes a
> base unit of kB, so it would require a negative multiplier, and
> pg_size_bytes() would have to be taught to divide by the magnitude of
> negative multipliers in the same way that guc.c does.
>
> ISTM that it would be far less code, and much simpler and more
> readable to just parse the supported units directly in
> pg_size_bytes(), rather than trying to share code with guc.c, when the
> supported units are actually different and may well diverge further in
> the future.
>

yes, if we have different units, then independent control tables has more
sense.



>
> I'll try to hack something up, and see what it looks like.
>
> Regards,
> Dean
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-16 Thread Dean Rasheed
On 16 February 2016 at 05:01, Pavel Stehule  wrote:
> 2016-02-15 10:16 GMT+01:00 Dean Rasheed :
>> Is there any reason not to make
>> pg_size_bytes() return numeric?
>>
> This is a question. I have not a strong opinion about it. There are no any
> technical objection - the result will be +/- same. But you will enforce
> Numeric into outer expression evaluation.
>
> The result will not be used together with function pg_size_pretty, but much
> more with functions pg_relation_size, pg_relation_size, .. and these
> functions doesn't return Numeric. These functions returns bigint. Bigint is
> much more natural type for this purpose.
>
> Is there any use case for Numeric?
>

[Shrug] I don't really have a strong opinion about it either, but it
seemed that maybe the function might have some wider uses beyond just
comparing object sizes, and since it's already computing the numeric
size, it might as well just return it.


Looking at the rest of the patch, I think there are other things that
need tidying up -- the unit parsing code for one. This is going to
some effort to reuse the memory_unit_conversion_table from guc.c, but
the result is that it has added quite a bit of new code and now the
responsibility for parsing different units is handled by different
functions in different files, which IMO is quite messy. Worse, the
error message is wrong and misleading:

select pg_size_bytes('10 bytes'); -- OK
select pg_size_bytes('10 gallons');
ERROR:  invalid size: "10 gallons"
DETAIL:  Invalid size unit "gallons"
HINT:  Valid units for this parameter are "kB", "MB", "GB", and "TB".

which fails to mention that "bytes" is also a valid unit.

Fixing that in parse_memory_unit() would be messy because it assumes a
base unit of kB, so it would require a negative multiplier, and
pg_size_bytes() would have to be taught to divide by the magnitude of
negative multipliers in the same way that guc.c does.

ISTM that it would be far less code, and much simpler and more
readable to just parse the supported units directly in
pg_size_bytes(), rather than trying to share code with guc.c, when the
supported units are actually different and may well diverge further in
the future.

I'll try to hack something up, and see what it looks like.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-15 Thread Pavel Stehule
Hi

2016-02-15 10:16 GMT+01:00 Dean Rasheed :

> > On 12/02/16 10:19, Dean Rasheed wrote:
> >> This seems like a reasonable first patch for me as a committer, so
> >> I'll take it unless anyone else was planning to do so.
> >
>
> So looking at this, it seems that for the most part pg_size_bytes()
> will parse any output produced by pg_size_pretty(). The exception is
> that there are 2 versions of pg_size_pretty(), one that takes bigint
> and one that takes numeric, whereas pg_size_bytes() returns bigint, so
> it can't handle all inputs. Is there any reason not to make
> pg_size_bytes() return numeric?


> It would still be compatible with the example use cases, but it would
> be a better inverse of both variants of pg_size_pretty() and would be
> more future-proof. It already works internally using numeric, so it's
> a trivial change to make now, but impossible to change in the future
> without introducing a new function with a different name, which is
> messy.
>
> Thoughts?
>

This is a question. I have not a strong opinion about it. There are no any
technical objection - the result will be +/- same. But you will enforce
Numeric into outer expression evaluation.

The result will not be used together with function pg_size_pretty, but much
more with functions pg_relation_size, pg_relation_size, .. and these
functions doesn't return Numeric. These functions returns bigint. Bigint is
much more natural type for this purpose.

Is there any use case for Numeric?

Regards

Pavel


>
> Regards,
> Dean
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-15 Thread Vitaly Burovoy
On 2/15/16, Vitaly Burovoy  wrote:
> P.S.: "bytes" size unit was added just for consistency: each group
> should have a name, even with an exponent of 1.

Oops... Of course, "even with an exponent of 0".
-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-15 Thread Vitaly Burovoy
On 2/15/16, Dean Rasheed  wrote:
>> On 12/02/16 10:19, Dean Rasheed wrote:
>>> This seems like a reasonable first patch for me as a committer, so
>>> I'll take it unless anyone else was planning to do so.
>>
>
> So looking at this, it seems that for the most part pg_size_bytes()
> will parse any output produced by pg_size_pretty(). The exception is
> that there are 2 versions of pg_size_pretty(), one that takes bigint
> and one that takes numeric, whereas pg_size_bytes() returns bigint, so
> it can't handle all inputs. Is there any reason not to make
> pg_size_bytes() return numeric?

I guess the only reason is a comparison purpose. The example from the
first letter[1]:

SELECT * FROM pg_class
  WHERE pg_table_size(oid) > pg_human_size('2GB');

... but not for something like:
WITH sel AS (
  SELECT pg_size_pretty(oid) AS _size, * FROM pg_class WHERE ...
)
SELECT pg_size_bytes(_size), * FROM sel;


So the use case of the "pg_size_bytes" is getting a value to compare
with a result from pg_table_size, ..., even with pg_database_size,
_each_ of whom returns bigint (or just int), but not numeric:

postgres=# \df pg_*_size
   List of functions
   Schema   |  Name  | Result data type | Argument
data types |  Type
++--+-+
 pg_catalog | pg_column_size | integer  | "any"
   | normal
 pg_catalog | pg_database_size   | bigint   | name
   | normal
 pg_catalog | pg_database_size   | bigint   | oid
   | normal
 pg_catalog | pg_indexes_size| bigint   | regclass
   | normal
 pg_catalog | pg_relation_size   | bigint   | regclass
   | normal
 pg_catalog | pg_relation_size   | bigint   | regclass,
text  | normal
 pg_catalog | pg_table_size  | bigint   | regclass
   | normal
 pg_catalog | pg_tablespace_size | bigint   | name
   | normal
 pg_catalog | pg_tablespace_size | bigint   | oid
   | normal
 pg_catalog | pg_total_relation_size | bigint   | regclass
   | normal
(10 rows)

The commit message for "pg_size_pretty(numeric)" explains[2] it was
introduced for using with "pg_xlog_location_diff"[3] only.
Since "pg_size_bytes" supports up to (4 EiB-1B)[4] I hope nobody will
want to send a query like "tell me where difference between two
transaction log locations is bigger or equal than 4EiB"...

But comparison between int8 and numeric is OK, so usage of rational
input will not break queries:

postgres=# select '10e200'::numeric > 10::int8 as check;
 check
---
 t
(1 row)


P.S.: "bytes" size unit was added just for consistency: each group
should have a name, even with an exponent of 1.

> It would still be compatible with the example use cases, but it would
> be a better inverse of both variants of pg_size_pretty() and would be
> more future-proof. It already works internally using numeric, so it's
> a trivial change to make now, but impossible to change in the future
> without introducing a new function with a different name, which is
> messy.
>
> Thoughts?
>
> Regards,
> Dean

[1]http://www.postgresql.org/message-id/cafj8prd-tgodknxdygecza4on01_urqprwf-8ldkse-6bdh...@mail.gmail.com
[2]http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=4a2d7ad76f5f275ef2d6a57e1a61d5bf756349e8
[3]http://www.postgresql.org/docs/devel/static/functions-admin.html#FUNCTIONS-ADMIN-BACKUP-TABLE
[4]https://en.wikipedia.org/wiki/Binary_prefix#Adoption_by_IEC.2C_NIST_and_ISO

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-15 Thread Dean Rasheed
> On 12/02/16 10:19, Dean Rasheed wrote:
>> This seems like a reasonable first patch for me as a committer, so
>> I'll take it unless anyone else was planning to do so.
>

So looking at this, it seems that for the most part pg_size_bytes()
will parse any output produced by pg_size_pretty(). The exception is
that there are 2 versions of pg_size_pretty(), one that takes bigint
and one that takes numeric, whereas pg_size_bytes() returns bigint, so
it can't handle all inputs. Is there any reason not to make
pg_size_bytes() return numeric?

It would still be compatible with the example use cases, but it would
be a better inverse of both variants of pg_size_pretty() and would be
more future-proof. It already works internally using numeric, so it's
a trivial change to make now, but impossible to change in the future
without introducing a new function with a different name, which is
messy.

Thoughts?

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-12 Thread Heikki Linnakangas

On 12/02/16 10:19, Dean Rasheed wrote:

On 12 February 2016 at 06:25, Pavel Stehule  wrote:

Thank you for review and other work



This seems like a reasonable first patch for me as a committer, so
I'll take it unless anyone else was planning to do so.


Great! You'll want to copy-edit the comments and docs a bit, to fix 
spellings etc, and run pgindent on it. And don't forget to bump 
catversion! :-)


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-12 Thread Dean Rasheed
On 12 February 2016 at 06:25, Pavel Stehule  wrote:
> Thank you for review and other work
>

This seems like a reasonable first patch for me as a committer, so
I'll take it unless anyone else was planning to do so.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-11 Thread Pavel Stehule
Hi

assigned https://commitfest.postgresql.org/9/514/

Regards

Pavel


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-11 Thread Pavel Stehule
2016-02-11 7:44 GMT+01:00 Vitaly Burovoy :

> On 2/10/16, Pavel Stehule  wrote:
> > Hi Vitaly,
> >
> > please, can you send your version of this patch, how we talked about it
> in
> > Moscow?
> >
> > Thank you
> >
> > Pavel
>
> Hello, Pavel!
>
> Please find attached my version of the patch (it applies cleanly on
> top of 64d89a9 which is current master).
>
> It is time to change oid. 3331 is used by "bytea_sortsupport", I set
> 3334 to "pg_size_bytes".
>
> I got a code design of numbers checking from json_lex_number in
> src/backend/utils/adt/json.c
> For me it seems more structured. I adapted it a little and it allows
> to add parsing an exponent (like '10е3 Mb') easily for full support of
> numeric (if sometimes it is necessary).
>

yes, it is better structured


>
> When I added "trimming" for size units (playing with avoiding an extra
> buffer), I found there is easy to support "bytes" unit (but "byte" is
> still unsupported).
>

I am little bit unsure about support the unit unsupported by GUC parser.
But for usage in custom space and for this current usage, it is acceptable.


>
>
> Also this version includes all changes I mentioned in my last review[1]:
> 1. parse_memory_unit returns value instead of using a pointer (return
> zero if noting is found) for it;
> 2. all messages are in a single style (nuances are in errdetails);
> 3. "select"s are in uppercase, rephrased and moved a comment block in test;
> 4. several tests are added (also with supporting of "bytes" unit);
> 5. a sentence in a documentation is rephrased (numeric->fixed-point
> number); "bytes" unit is added to both functions;
> 6. fixed indentation a little;
> 7. pfree is removed (it is easier than removing all other allocated
> resources).
>

ok, thank you


>
>
> I still think my changes are little and they are based on your work
> (and research).
>

thank you very much - but you refactoring is significant and helpful. I'll
reassign your version to opened commitfest.

Regards

Pavel


>
> [1]
> http://www.postgresql.org/message-id/cakoswnk13wvdem06lro-hucr0pr6et29+dvqy6j5skxzaru...@mail.gmail.com
>
> --
> Best regards,
> Vitaly Burovoy
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-11 Thread Pavel Stehule
This is last incarnation of my patch (cleaned and refactored by Vitaly Burovoy)
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-11 Thread Pavel Stehule
Hi

2016-02-12 2:28 GMT+01:00 Vitaly Burovoy :

> Hello!
>
> On 2/11/16, Pavel Stehule  wrote:
> > Hi
> >
> > assigned https://commitfest.postgresql.org/9/514/
> >
> > Regards
> > Pavel
>
>
> This patch was almost done by the end of the previous CF(2016-01):
> there was few little flaws which are solved by now.
>
> I have reviewed this patch.
> It applies cleanly at the top of current master
> (f144f73242acef574bc27a4c70e809a64806e4a4), compiles silently and
> implements the behavior reached by consensus.
>
> All size units (the same as used in the GUC) are supported.
> In addition "bytes" is also supported that makes it be a inverse
> function for the pg_size_pretty[1].
>
> The documentation describes behavior of the function and possible size
> units. I don't see what else needs to be said.
>
> The code is clean and commented.
>
> Regression tests cover possible use cases and corner cases.
>
>
> Notes for a committer:
> 1. pg_proc.h has changed, so the CATALOG_VERSION_NO must also be changed.
> 2. Code comments, error message strings, DESCR in pg_proc.h and
> changes in the documentation need proof-reading by a native English
> speaker, which the Author and I are not.
>

Thank you for review and other work

Nice day

Pavel


>
>
> [CF] https://commitfest.postgresql.org/9/514/
> [1]
> http://www.postgresql.org/message-id/ca+tgmozfomg4eyorzzgf7pzotg9pxpuhtqvxlfskim4izh8...@mail.gmail.com
>
> --
> Best regards,
> Vitaly Burovoy
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-11 Thread Pavel Stehule
Hi

2016-02-11 7:44 GMT+01:00 Vitaly Burovoy :

> On 2/10/16, Pavel Stehule  wrote:
> > Hi Vitaly,
> >
> > please, can you send your version of this patch, how we talked about it
> in
> > Moscow?
> >
> > Thank you
> >
> > Pavel
>
> Hello, Pavel!
>
> Please find attached my version of the patch (it applies cleanly on
> top of 64d89a9 which is current master).
>
>
I tested this patch and all tests passed

Regards

Pavel


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-11 Thread Vitaly Burovoy
Hello!

On 2/11/16, Pavel Stehule  wrote:
> Hi
>
> assigned https://commitfest.postgresql.org/9/514/
>
> Regards
> Pavel


This patch was almost done by the end of the previous CF(2016-01):
there was few little flaws which are solved by now.

I have reviewed this patch.
It applies cleanly at the top of current master
(f144f73242acef574bc27a4c70e809a64806e4a4), compiles silently and
implements the behavior reached by consensus.

All size units (the same as used in the GUC) are supported.
In addition "bytes" is also supported that makes it be a inverse
function for the pg_size_pretty[1].

The documentation describes behavior of the function and possible size
units. I don't see what else needs to be said.

The code is clean and commented.

Regression tests cover possible use cases and corner cases.


Notes for a committer:
1. pg_proc.h has changed, so the CATALOG_VERSION_NO must also be changed.
2. Code comments, error message strings, DESCR in pg_proc.h and
changes in the documentation need proof-reading by a native English
speaker, which the Author and I are not.


[CF] https://commitfest.postgresql.org/9/514/
[1] 
http://www.postgresql.org/message-id/ca+tgmozfomg4eyorzzgf7pzotg9pxpuhtqvxlfskim4izh8...@mail.gmail.com

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-10 Thread Pavel Stehule
Hi Vitaly,

please, can you send your version of this patch, how we talked about it in
Moscow?

Thank you

Pavel


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-10 Thread Vitaly Burovoy
On 2/10/16, Pavel Stehule  wrote:
> Hi Vitaly,
>
> please, can you send your version of this patch, how we talked about it in
> Moscow?
>
> Thank you
>
> Pavel

Hello, Pavel!

Please find attached my version of the patch (it applies cleanly on
top of 64d89a9 which is current master).

It is time to change oid. 3331 is used by "bytea_sortsupport", I set
3334 to "pg_size_bytes".

I got a code design of numbers checking from json_lex_number in
src/backend/utils/adt/json.c
For me it seems more structured. I adapted it a little and it allows
to add parsing an exponent (like '10е3 Mb') easily for full support of
numeric (if sometimes it is necessary).

When I added "trimming" for size units (playing with avoiding an extra
buffer), I found there is easy to support "bytes" unit (but "byte" is
still unsupported).


Also this version includes all changes I mentioned in my last review[1]:
1. parse_memory_unit returns value instead of using a pointer (return
zero if noting is found) for it;
2. all messages are in a single style (nuances are in errdetails);
3. "select"s are in uppercase, rephrased and moved a comment block in test;
4. several tests are added (also with supporting of "bytes" unit);
5. a sentence in a documentation is rephrased (numeric->fixed-point
number); "bytes" unit is added to both functions;
6. fixed indentation a little;
7. pfree is removed (it is easier than removing all other allocated resources).


I still think my changes are little and they are based on your work
(and research).

[1]http://www.postgresql.org/message-id/cakoswnk13wvdem06lro-hucr0pr6et29+dvqy6j5skxzaru...@mail.gmail.com

-- 
Best regards,
Vitaly Burovoy


pg-size-bytes-11chgd.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-31 Thread Tom Lane
Vitaly Burovoy  writes:
> On 1/30/16, Pavel Stehule  wrote:
>> I though about it, but it is not possible. Every PG_TRY/CATCH must be
>> finished by RETHROW.

> No, src/include/utils/elog.h tells different (emphasizes are mine):
> "The error recovery code can _optionally_ do PG_RE_THROW() to
> propagate the _same_ error outwards."

> So you can use it without rethrowing.

You can use it without re-throwing, but ONLY if you use the full
subtransaction mechanism to clean up.  Assuming that the only possible
errors are ones that don't need cleanup will get your patch rejected.

IME, in practically all cases in simple C functions, it's better to find
a way that doesn't require using TRY/CATCH to catch an expected error.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-30 Thread Vitaly Burovoy
Hello,Pavel!

On 1/26/16, Pavel Stehule  wrote:
> I am grateful for review - now this patch is better, and I hope near final
> stage.
>
> Regards
> Pavel

I'm pretty sure we are close to it.

Now besides a code design I mentioned[1] before (and no one has
answered me about it), there are a few small notes.

Now you have all messages in one style ("invalid size: \"%s\"") except
size units ("invalid unit \"%s\", there are two places). I guess
according to the Error Style Guide[2] it should be like
"errmsg("invalid size: \"%s\"", str), errdetail("Invalid size unit
\"%s\"", unitstr),".

If it is not hard for you, it'll look better if "select" is uppercased
in tests and a comment about sharing "memory_unit_conversion_table" is
close to the "SELECT pg_size_bytes('1PB');". Moreover your comment has
a flaw: "pg_size_pretty" doesn't use that array at all, so the phrase
"These functions shares" is wrong.

Also I've just found that numeric format supports values with exponent
like "1e5" which is not allowed in pg_size_bytes. I guess the phrase
"The format is numeric with an optional size unit" should be replaced
to "The format is a fixed point number with an optional size unit" in
the documentation.

Have a look for a formatting: in the rows "num = DatumGetNumeric",
"mul_num = DatumGetNumeric", "num = DatumGetNumeric" and in error
reporting close to the "/* only alphabetic character are allowed */"
parameters in the next lines are not under the first parameters.

I insist there is no reason to call "pfree" for "str" and "buffer".
But if you do so, clean up also buffers from numeric_in, numeric_mul
and int8_numeric. If you insist it should be left as is, I leave that
decision to a committer.

P.S.: Have you thought to wrap the call "numeric_in" by a
PG_TRY/PG_CATCH instead of checking for correctness by yourself?

[1]http://www.postgresql.org/message-id/cakoswnm+ssggkysv09n_6hhfwzmrr+csjpgfhbpff5nnpfj...@mail.gmail.com
[2]http://www.postgresql.org/docs/devel/static/error-style-guide.html#AEN65
-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-30 Thread Pavel Stehule
Hi


> P.S.: Have you thought to wrap the call "numeric_in" by a
> PG_TRY/PG_CATCH instead of checking for correctness by yourself?
>

I though about it, but it is not possible. Every PG_TRY/CATCH must be
finished by RETHROW. Only when you will open subtransaction and you are
playing with resource manager, you can do it. It is pretty expensive.

You can see in our code lot of situation when some function returns true,
false, "error message" instead raising a exception. I would not to refactor
numericin function in this style. This function is in critical path of COPY
FROM, and any more calls can decrease performance. And then I have to do
these checks before calling.

Regards

Pavel


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-30 Thread Vitaly Burovoy
On 1/30/16, Pavel Stehule  wrote:
>> P.S.: Have you thought to wrap the call "numeric_in" by a
>> PG_TRY/PG_CATCH instead of checking for correctness by yourself?
>
> I though about it, but it is not possible. Every PG_TRY/CATCH must be
> finished by RETHROW.

No, src/include/utils/elog.h tells different (emphasizes are mine):
"The error recovery code can _optionally_ do PG_RE_THROW() to
propagate the _same_ error outwards."

So you can use it without rethrowing.

> Only when you will open subtransaction and you are playing with resource 
> manager, you can do it.

Really? I did not find it around the "#define PG_TRY()" definition in
"src/include/utils/elog.h".

I guess it is important to use a subtransaction if you want to catch
an exception and go further. In case of calling "numeric_in" from the
"pg_size_bytes" there is no reason to use a subtransaction that may
close any open relation etc., because a new ereport with different
message is emitted, which fails the current transaction anyway.

PG_TRY is only calls sigsetjmp and sets PG_exception_stack. If no
exception is emitted, penalty is the only sigsetjmp call (but I don't
know how heavy it is), if an exception is emitted, there is no matter
how long it handles.

> It is pretty expensive.

Ok. Performance makes sense.

> You can see in our code lot of situation when some function returns true,
> false, "error message" instead raising a exception.

I know. It is a common style in C programs.

> I would not to refactor numeric_in function in this style.

No doubt. It is not necessary.

> This function is in critical path of COPY
> FROM, and any more calls can decrease performance. And then I have to do
> these checks before calling.
>
> Regards
> Pavel

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-26 Thread Pavel Stehule
Hi

2016-01-26 13:48 GMT+01:00 Vitaly Burovoy :

> Hello, Pavel!
>
> That letter was not a complain against you. I'm sorry if it seems like
> that for you.
>

ok.

It was an intermediate review with several points to be clear for _me_
> from experienced hackers, mostly about a code design.
>
> 26.01.2016 07:05, Pavel Stehule пишет:
> >> pg_proc.h has changed, so the CATALOG_VERSION_NO must be also changed.
> > this is not a part of patch - only a commiter knows CATALOG_VERSION_NO
> >
> CATALOG_VERSION_NO is mentioned for a committer, it is not your fault.
>
> >> III. There is no support of 'bytes' unit, it seems such behavior got
> >> majority approval[2].
> >
> > No, because I have to use the supported units by configuration. The
> configuration supports only three chars long units. Support for "bytes" was
> removed, when I removed proprietary unit table.
> >
> Point "III" is the only for the question "d". Also to collect any
> possible features from the thread in one place.
>
> >> V. The documentation lacks a note that the base of the "pg_size_bytes"
> >> is 1024 whereas the base of the "pg_size_pretty" is 1000.
> >
> > It isn't true, base for both are 1024. It was designed when special
> table was used for pg_size_bytes. But when we share same control table,
> then is wrong to use different table. The result can be optically
> different, but semantically same.
> >
> Oops, I was wrong about a base of pg_size_pretty. It was a morning
> after a hard night...


> > negative values is fully supported.
> You have mentioned it at least three times before. It is not my new
> requirement or a point to your fault, it is an argument for
> symmetry/asymmetry of the function.
>
> > support of "bytes" depends on support "bytes" unit by GUC. When "bytes"
> unit will be supported, then it can be used in pg_size_bytes immediately.
> By the way there can be a comparison for a special size unit before
> calling parse_memory_unit.
>

there was more requirements based on original proposal (based in  separated
control tables). When I leaved this design, I dropped more requirements and
now is little bit hard to distinguish what is related and with. Some
features can be added later without any possible compatibility issues in
next commit fest or later, and I am searching some good borders for this
patch. Now, this border is a shared memory unit control table. And I am
trying to hold this border.

I am grateful for review - now this patch is better, and I hope near final
stage.

Regards

Pavel




>
> > Regards
> > Pavel
> >
> >> [2]
> http://www.postgresql.org/message-id/cacaco5qw7ffsffhkstjtycp4qf3oh9za14sc6z3dxx2yssj...@mail.gmail.com
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-26 Thread Vitaly Burovoy
Hello, Pavel!

That letter was not a complain against you. I'm sorry if it seems like
that for you.
It was an intermediate review with several points to be clear for _me_
from experienced hackers, mostly about a code design.

26.01.2016 07:05, Pavel Stehule пишет:
>> pg_proc.h has changed, so the CATALOG_VERSION_NO must be also changed.
> this is not a part of patch - only a commiter knows CATALOG_VERSION_NO
>
CATALOG_VERSION_NO is mentioned for a committer, it is not your fault.

>> III. There is no support of 'bytes' unit, it seems such behavior got
>> majority approval[2].
>
> No, because I have to use the supported units by configuration. The 
> configuration supports only three chars long units. Support for "bytes" was 
> removed, when I removed proprietary unit table.
>
Point "III" is the only for the question "d". Also to collect any
possible features from the thread in one place.

>> V. The documentation lacks a note that the base of the "pg_size_bytes"
>> is 1024 whereas the base of the "pg_size_pretty" is 1000.
>
> It isn't true, base for both are 1024. It was designed when special table was 
> used for pg_size_bytes. But when we share same control table, then is wrong 
> to use different table. The result can be optically different, but 
> semantically same.
>
Oops, I was wrong about a base of pg_size_pretty. It was a morning
after a hard night...

> negative values is fully supported.
You have mentioned it at least three times before. It is not my new
requirement or a point to your fault, it is an argument for
symmetry/asymmetry of the function.

> support of "bytes" depends on support "bytes" unit by GUC. When "bytes" unit 
> will be supported, then it can be used in pg_size_bytes immediately.
By the way there can be a comparison for a special size unit before
calling parse_memory_unit.

> Regards
> Pavel
>
>> [2]http://www.postgresql.org/message-id/cacaco5qw7ffsffhkstjtycp4qf3oh9za14sc6z3dxx2yssj...@mail.gmail.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-25 Thread Vitaly Burovoy
Hello!

I have reviewed this patch. It applies and compiles cleanly at the
current master 1129c2b0ad2732f301f696ae2cf98fb063a4c1f8 and implements
the behavior reached by a consensus.

All size units (the same as used in the GUC) are supported.

The documentation is present and describes behavior of the function.

The code is mostly clear and commented enough, but I doubt about
design (see p.3).

Regression tests are present (see p.II).

pg_proc.h has changed, so the CATALOG_VERSION_NO must be also changed.

Code comments, error message string, DESCR in pg_proc.h and doc
changes need proof-reading by a native English speaker, which I am
not.


===
The last patch fixes only part of my last notes, so I can only return
it to the next round with the same notes.

The list of my notes has not been done (all points listed below was
mentioned in my last review[1]):

1. The documentation has duplicate parts of size units (in table and
in the text). I think that one of them should be deleted (from the
table).

2. Test has a note about sharing(using) size units via(from)
memory_unit_conversion_table and a requirement to update the
documentation when new units are supported, but the note will not be
seen because there is no tests for that fail for units ("PB", "EB",
"ZB") which will change in the future by the
"memory_unit_conversion_table".

3. Code design is looked a little odd for me.

3a. There is an attempt to check a correctness of a numeric value
instead of using numeric_in which does the same thing. In the last
case the code looks more compact and nice (see the full explanation in
the previous review[1] and both PoC there).

Possibly that way is chosen to emit its own error messages ("invalid
size:" instead of "invalid input syntax for type numeric:"), but it
still leads errors like 'invalid unit: "+ kB"' for input value "+912+
kB" (see "dbsize.out").

3b. There is an extra copying from the buffer "str" to the "buffer"
(it can be avoided and shown in an PoC "single_buffer.c").

3c. There is freeing buffers "str" and "buffer" that looks as a
counter-design since MemoryContext frees them at once with other
expired buffers (like "num" from numeric_in and numeric_mul; also
"mul_num" from int8_numeric)


===
While I was writing that text I found five additional notes. Both
parts (above and below) are important (see concusion).

I. The new version of the patch (pg-size-bytes-10.patch) has the
simplest typo in an updated (from v9) row "inavalid size".

II. There is no test for an empty input value, but it works as
expected (an error 'invalid size: ""').

III. There is no support of 'bytes' unit, it seems such behavior got
majority approval[2].

IV. Code design of the function "parse_memory_unit" is returning
"bool" value and set "multiplier" via pointer. But while Tom Lane was
reviewed my patch[3] in the similar case he changed code that now
returns (see NonFiniteTimestampTzPart) not a flag, but a value (where
zero still means an error). Be honest it looks better, but the current
code is written like nearby functions.

V. The documentation lacks a note that the base of the "pg_size_bytes"
is 1024 whereas the base of the "pg_size_pretty" is 1000.

===
Concusion:

I *can't* mark this patch as "Ready for Committer" with all previous
notes, and I ask experienced hackers for a help.
I'm not a strong Postgres hacker yet and sending to the next round
with the same notes seems a nitpicking for me.

So questions to the hackers:

a. Are the rest of the notes from the previous review[1] (and listed
in pp.1-3c) a nitpicking or should be done?

b. Does the function in PoC "single_buffer.c" look more complex or easer?

c. Is it worth for p.II to change error message like 'Empty input
value is not a valid size for pg_size_bytes: "%s"'?

d. Should the function "pg_size_bytes" handle "bytes" as a size unit
(see p.III)?
There was no disagreement, but if I saw that thread before I became a
reviewer and saw silent approval I would have argued. Moreover
supporting of negative sizes was explained[4] by a symmetry for the
"pg_size_pretty", which should include all size units produced by the
"pg_size_pretty". On the other hand those functions can't be symmetric
due to difference in bases (see p.V and [5])

e. What way is preferred for p.IV?

f. If p.V is actual, how to write it in the documentation (be honest
I've no idea)? Should a description of the "pg_size_pretty" be changed
too?


[1]http://www.postgresql.org/message-id/CAKOSWNn=p8gx-p5y-b4t4dg-aahatup+crg41hq9beobzwx...@mail.gmail.com
[2]http://www.postgresql.org/message-id/cacaco5qw7ffsffhkstjtycp4qf3oh9za14sc6z3dxx2yssj...@mail.gmail.com
[3]http://git.postgresql.org/pg/commitdiff/647d87c56ab6da70adb753c08d7cdf7ee905ea8a
[4]http://www.postgresql.org/message-id/ca+tgmozfomg4eyorzzgf7pzotg9pxpuhtqvxlfskim4izh8...@mail.gmail.com
[5]http://www.postgresql.org/message-id/ca+tgmoygs_xixuqy+xyw9futtmrtbc_s0c1c0_wobwvbnf8...@mail.gmail.com


-- 
Sent via pgsql-hackers mailing list 

Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-25 Thread Pavel Stehule
Hi

2016-01-26 6:25 GMT+01:00 Vitaly Burovoy :

> Hello!
>
>
> Regression tests are present (see p.II).
>
> pg_proc.h has changed, so the CATALOG_VERSION_NO must be also changed.
>

this is not a part of patch - only a commiter knows CATALOG_VERSION_NO


>
> Code comments, error message string, DESCR in pg_proc.h and doc
> changes need proof-reading by a native English speaker, which I am
> not.
>
>
> ===
> The last patch fixes only part of my last notes, so I can only return
> it to the next round with the same notes.
>
> The list of my notes has not been done (all points listed below was
> mentioned in my last review[1]):
>
> 1. The documentation has duplicate parts of size units (in table and
> in the text). I think that one of them should be deleted (from the
> table).
>

fixed


>
> 2. Test has a note about sharing(using) size units via(from)
> memory_unit_conversion_table and a requirement to update the
> documentation when new units are supported, but the note will not be
> seen because there is no tests for that fail for units ("PB", "EB",
> "ZB") which will change in the future by the
> "memory_unit_conversion_table".
>

fixed


> 3. Code design is looked a little odd for me.
>
> 3a. There is an attempt to check a correctness of a numeric value
> instead of using numeric_in which does the same thing. In the last
> case the code looks more compact and nice (see the full explanation in
> the previous review[1] and both PoC there).
>
> Possibly that way is chosen to emit its own error messages ("invalid
> size:" instead of "invalid input syntax for type numeric:"), but it
> still leads errors like 'invalid unit: "+ kB"' for input value "+912+
> kB" (see "dbsize.out").
>

In tried to enhanced this point (I don't agree, so this is a bug)


>
> 3b. There is an extra copying from the buffer "str" to the "buffer"
> (it can be avoided and shown in an PoC "single_buffer.c").
>
> 3c. There is freeing buffers "str" and "buffer" that looks as a
> counter-design since MemoryContext frees them at once with other
> expired buffers (like "num" from numeric_in and numeric_mul; also
> "mul_num" from int8_numeric)
>

Numeric operations are on critical path for OLAP applications - so the code
is rewritten to minimize palloc/free operations. I don't expect usage
pg_size_bytes in OLAP critical path.


>
>
> ===
> While I was writing that text I found five additional notes. Both
> parts (above and below) are important (see concusion).
>
> I. The new version of the patch (pg-size-bytes-10.patch) has the
> simplest typo in an updated (from v9) row "inavalid size".
>

fixed

>
> II. There is no test for an empty input value, but it works as
> expected (an error 'invalid size: ""').
>

fixed

>
> III. There is no support of 'bytes' unit, it seems such behavior got
> majority approval[2].
>

No, because I have to use the supported units by configuration. The
configuration supports only three chars long units. Support for "bytes" was
removed, when I removed proprietary unit table.


>
> IV. Code design of the function "parse_memory_unit" is returning
> "bool" value and set "multiplier" via pointer. But while Tom Lane was
> reviewed my patch[3] in the similar case he changed code that now
> returns (see NonFiniteTimestampTzPart) not a flag, but a value (where
> zero still means an error). Be honest it looks better, but the current
> code is written like nearby functions.
>

It is a commiter decision. This is a implementation detail, and I don't see
any variant as significant better - using 0 as error flag isn't the best
too. But this is really detail.

>
> V. The documentation lacks a note that the base of the "pg_size_bytes"
> is 1024 whereas the base of the "pg_size_pretty" is 1000.
>

It isn't true, base for both are 1024. It was designed when special table
was used for pg_size_bytes. But when we share same control table, then is
wrong to use different table. The result can be optically different, but
semantically same.

postgres=# select pg_size_pretty(pg_size_bytes('1MB'));
┌┐
│ pg_size_pretty │
╞╡
│ 1024 kB│
└┘
(1 row)

Time: 0.787 ms
postgres=# select pg_size_pretty(pg_size_bytes('1024MB'));
┌┐
│ pg_size_pretty │
╞╡
│ 1024 MB│
└┘
(1 row)



>
> ===
> Concusion:
>
> I *can't* mark this patch as "Ready for Committer" with all previous
> notes, and I ask experienced hackers for a help.
> I'm not a strong Postgres hacker yet and sending to the next round
> with the same notes seems a nitpicking for me.
>
> So questions to the hackers:
>
> a. Are the rest of the notes from the previous review[1] (and listed
> in pp.1-3c) a nitpicking or should be done?
>
> b. Does the function in PoC "single_buffer.c" look more complex or easer?
>
> c. Is it worth for p.II to change error message like 'Empty input
> value is not a valid size for pg_size_bytes: "%s"'?
>

It is simply to implement, but It is 

Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-25 Thread Pavel Stehule
Hi

2016-01-22 7:03 GMT+01:00 Vitaly Burovoy :

> On 1/20/16, Pavel Stehule  wrote:
> > ...
> > New version is attached
> >
> > Regards
> > Pavel
>
> Hello!
>
> 1. Why the function is marked as VOLATILE? It depends only on input
> value, it does change nothing in the DB.
> I guess it should be IMMUTABLE for efficient caching its result.
>

sure, it should be immutable,
fixed

copy/paste bug

?? why pg_size_pretty is volatile??


>
> 2.
> >   text*arg = PG_GETARG_TEXT_PP(0);
> ...
> >   str = text_to_cstring(arg);
> >
> Hmm... My question was _why_ you get TEXT and convert it instead of
> getting an argument of type cstring (not about usage of
> VARSIZE_ANY_EXHDR).
> It was enough to give me a link[1] and leave the usage of
> VARSIZE_ANY_EXHDR as is.
>
>
I understand to this question now. This is PostgreSQL pattern - cstring is
internal type only - used for in/out functions only

you can try

select * from pg_proc where 'cstring'::regtype::oid =
any(proargtypes::oid[]);

With cstring you are bypass PostgreSQL type system (cast from cstring and
to cstring is allowed for anytype) - so it should not be used in usual
functions.


>
>
> I think all the other claims below are mostly cosmetic. Main behavior
> is OK (considering its usage will not be in heavy queries).
>
> ===
> Documentation:
> Besides currently added row in a table I think it is worth to add
> detailed comment after a block "pg_size_pretty can be
> used" similar to (but the best way is to get advice for a correct
> phrase):
> "pg_size_bytes can be used to get a size in bytes as bigint from a
> human-readable format for a comparison purposes (it is the opposite to
> pg_size_pretty function). The format is numeric with an optional size
> unit: kB, MB, GB or TB."
> (and delete unit sizes from the table row).
>

fixed partially

>
> ===
> Tests:
> Since there was a letter with an explanation why units bigger than
> "TB" can't be used[2], it is a good reason to add that size units
> ("PB", "EB", "ZB") in tests with a note to update the documentation
> part when that unit are supported.
>

fixed


>
> ===
> Code style:
> 1. When you declare pointers, you must align _names_ by the left
> border, i.e. asterisks must be at one position to the left from the
> aligned names, i.e. one to three spaces instead of TAB before them.
>

fixed

>
> 2.
> > int   multiplier;
> One more TAB to align it with the name at the next line.
>

fixed

>
> ===
> Code:
>
> > /* allow whitespace between integer and unit */
> May be "numeric" instead of "integer"?
>

used "number"

>
> > "\"%s\" is not in expected format"
> It is not clear what format is expected.
>

I changed to string "invalid size: \"%s\"", but the final form of these
messages should be written by native speaker.


>
> > /* ignore plus symbol */
> It seems it is not ignored, but copied... ;-)
>

removed

>
> > to ger hintstr
> s/ger/get/
> > Elsewhere is used space trimmed buffer.
>


> I'd write it as "Otherwise trimmed value is used."
> I'm not good at English, but that full block looks little odd for me.
> I tried to reword it in the attachment single_buffer.c, but I don't
> think it is enough.
>
> > We allow ending spaces.
> "trailing"?
>
>
fixed


> > but this message can be little bit no intuitive - better text is "is not
> a valid number"
> It was a block with a comment "don't allow empty string", i.e. absence
> of a number...
>
> > Automatic memory deallocation doesn't cover all possible situations where
> > the function can be used - for example DirectFunctionCall - so explicit
> > deallocation can descrease a memory requirements when you call these
> > functions from C.
> DirectFunctionCall uses a memory context of the caller. For example,
> you don't free "num" allocated by numeric_in and numeric_mul, also
> mul_num allocated by int8_numeric...
> I'd ask experienced hackers for importance of freeing (or leaving)
> "str" and "buffer".
>
> > postgres=# select pg_size_bytes('+912+ kB');
> > ERROR:  invalid unit: "+ kB"
> > This is difficult - you have to divide string to two parts and first
> world is number, second world is unit.
> Your code looks like a duplicated (not by lines, but by behavior).
> numeric_in has similar checks, let it do them for you. The goal of
> your function is just split mantissa and size unit, and if the last
> one is found, turn it into an exponent.
> See my example in the attachment check_mantissa_by_numeric_in.c. There
> is just searching non-space char that can't be part of numeric. Then
> all before it passes to numeric_in and let it check anything it
> should. I even left first spaces to show full numeric part to a user
> if an error occurs (for any case).
> The second attachment single_buffer.c is an attempt to avoid
> allocating the second buffer (the first is "str" allocated by
> text_to_cstring) and copying into it. I don't think it gives a big
> improvement, but nevertheless.
>

I don't 

Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-21 Thread Vitaly Burovoy
On 1/20/16, Pavel Stehule  wrote:
> ...
> New version is attached
>
> Regards
> Pavel

I'm sorry I'll do a review only tonight.
-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-21 Thread Pavel Stehule
2016-01-21 11:51 GMT+01:00 Vitaly Burovoy :

> On 1/20/16, Pavel Stehule  wrote:
> > ...
> > New version is attached
> >
> > Regards
> > Pavel
>
> I'm sorry I'll do a review only tonight.
>

ook :)

Thank you

Pavel



> --
> Best regards,
> Vitaly Burovoy
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-21 Thread Pavel Stehule
2016-01-22 7:03 GMT+01:00 Vitaly Burovoy :

> On 1/20/16, Pavel Stehule  wrote:
> > ...
> > New version is attached
> >
> > Regards
> > Pavel
>
> Hello!
>
> 1. Why the function is marked as VOLATILE? It depends only on input
> value, it does change nothing in the DB.
> I guess it should be IMMUTABLE for efficient caching its result.
>
> 2.
> >   text*arg = PG_GETARG_TEXT_PP(0);
> ...
> >   str = text_to_cstring(arg);
> >
> Hmm... My question was _why_ you get TEXT and convert it instead of
> getting an argument of type cstring (not about usage of
> VARSIZE_ANY_EXHDR).
> It was enough to give me a link[1] and leave the usage of
> VARSIZE_ANY_EXHDR as is.
>
>
>
> I think all the other claims below are mostly cosmetic. Main behavior
> is OK (considering its usage will not be in heavy queries).
>
> ===
> Documentation:
> Besides currently added row in a table I think it is worth to add
> detailed comment after a block "pg_size_pretty can be
> used" similar to (but the best way is to get advice for a correct
> phrase):
> "pg_size_bytes can be used to get a size in bytes as bigint from a
> human-readable format for a comparison purposes (it is the opposite to
> pg_size_pretty function). The format is numeric with an optional size
> unit: kB, MB, GB or TB."
> (and delete unit sizes from the table row).
>
> ===
> Tests:
> Since there was a letter with an explanation why units bigger than
> "TB" can't be used[2], it is a good reason to add that size units
> ("PB", "EB", "ZB") in tests with a note to update the documentation
> part when that unit are supported.
>
> ===
> Code style:
> 1. When you declare pointers, you must align _names_ by the left
> border, i.e. asterisks must be at one position to the left from the
> aligned names, i.e. one to three spaces instead of TAB before them.
>
> 2.
> > int   multiplier;
> One more TAB to align it with the name at the next line.
>
> ===
> Code:
>
> > /* allow whitespace between integer and unit */
> May be "numeric" instead of "integer"?
>
> > "\"%s\" is not in expected format"
> It is not clear what format is expected.
>
> > /* ignore plus symbol */
> It seems it is not ignored, but copied... ;-)
>
> > to ger hintstr
> s/ger/get/
> > Elsewhere is used space trimmed buffer.
> I'd write it as "Otherwise trimmed value is used."
> I'm not good at English, but that full block looks little odd for me.
> I tried to reword it in the attachment single_buffer.c, but I don't
> think it is enough.
>
> > We allow ending spaces.
> "trailing"?
>
> > but this message can be little bit no intuitive - better text is "is not
> a valid number"
> It was a block with a comment "don't allow empty string", i.e. absence
> of a number...
>
> > Automatic memory deallocation doesn't cover all possible situations where
> > the function can be used - for example DirectFunctionCall - so explicit
> > deallocation can descrease a memory requirements when you call these
> > functions from C.
> DirectFunctionCall uses a memory context of the caller. For example,
> you don't free "num" allocated by numeric_in and numeric_mul, also
> mul_num allocated by int8_numeric...
> I'd ask experienced hackers for importance of freeing (or leaving)
> "str" and "buffer".
>

Sometimes, the memory can be released by caller only - or the memory usage
is pretty minimal. But when you can release, memory then you should to do
it to decrease push to memory allocator.



>
> > postgres=# select pg_size_bytes('+912+ kB');
> > ERROR:  invalid unit: "+ kB"
> > This is difficult - you have to divide string to two parts and first
> world is number, second world is unit.
> Your code looks like a duplicated (not by lines, but by behavior).
> numeric_in has similar checks, let it do them for you. The goal of
> your function is just split mantissa and size unit, and if the last
> one is found, turn it into an exponent.
> See my example in the attachment check_mantissa_by_numeric_in.c. There
> is just searching non-space char that can't be part of numeric. Then
> all before it passes to numeric_in and let it check anything it
> should. I even left first spaces to show full numeric part to a user
> if an error occurs (for any case).
> The second attachment single_buffer.c is an attempt to avoid
> allocating the second buffer (the first is "str" allocated by
> text_to_cstring) and copying into it. I don't think it gives a big
> improvement, but nevertheless.
>

This code is little bit more complex than it is necessary (but few lines
more) to produce readable error messages. I am thinking so it is correct.

I'll look on this patch next week.

Thank you for review

Regards

Pavel


>
> ===
> [1] http://www.postgresql.org/message-id/29618.1451882...@sss.pgh.pa.us
> [2]
> http://www.postgresql.org/message-id/CAB7nPqS6Wob4WnZb=dhb3o0pc-nx1v3xjszkn3z9kbexgcq...@mail.gmail.com
>
> --
> Best regards,
> Vitaly Burovoy
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-21 Thread Vitaly Burovoy
On 1/20/16, Pavel Stehule  wrote:
> ...
> New version is attached
>
> Regards
> Pavel

Hello!

1. Why the function is marked as VOLATILE? It depends only on input
value, it does change nothing in the DB.
I guess it should be IMMUTABLE for efficient caching its result.

2.
>   text*arg = PG_GETARG_TEXT_PP(0);
...
>   str = text_to_cstring(arg);
>   
Hmm... My question was _why_ you get TEXT and convert it instead of
getting an argument of type cstring (not about usage of
VARSIZE_ANY_EXHDR).
It was enough to give me a link[1] and leave the usage of
VARSIZE_ANY_EXHDR as is.



I think all the other claims below are mostly cosmetic. Main behavior
is OK (considering its usage will not be in heavy queries).

===
Documentation:
Besides currently added row in a table I think it is worth to add
detailed comment after a block "pg_size_pretty can be
used" similar to (but the best way is to get advice for a correct
phrase):
"pg_size_bytes can be used to get a size in bytes as bigint from a
human-readable format for a comparison purposes (it is the opposite to
pg_size_pretty function). The format is numeric with an optional size
unit: kB, MB, GB or TB."
(and delete unit sizes from the table row).

===
Tests:
Since there was a letter with an explanation why units bigger than
"TB" can't be used[2], it is a good reason to add that size units
("PB", "EB", "ZB") in tests with a note to update the documentation
part when that unit are supported.

===
Code style:
1. When you declare pointers, you must align _names_ by the left
border, i.e. asterisks must be at one position to the left from the
aligned names, i.e. one to three spaces instead of TAB before them.

2.
> int   multiplier;
One more TAB to align it with the name at the next line.

===
Code:

> /* allow whitespace between integer and unit */
May be "numeric" instead of "integer"?

> "\"%s\" is not in expected format"
It is not clear what format is expected.

> /* ignore plus symbol */
It seems it is not ignored, but copied... ;-)

> to ger hintstr
s/ger/get/
> Elsewhere is used space trimmed buffer.
I'd write it as "Otherwise trimmed value is used."
I'm not good at English, but that full block looks little odd for me.
I tried to reword it in the attachment single_buffer.c, but I don't
think it is enough.

> We allow ending spaces.
"trailing"?

> but this message can be little bit no intuitive - better text is "is not a 
> valid number"
It was a block with a comment "don't allow empty string", i.e. absence
of a number...

> Automatic memory deallocation doesn't cover all possible situations where
> the function can be used - for example DirectFunctionCall - so explicit
> deallocation can descrease a memory requirements when you call these
> functions from C.
DirectFunctionCall uses a memory context of the caller. For example,
you don't free "num" allocated by numeric_in and numeric_mul, also
mul_num allocated by int8_numeric...
I'd ask experienced hackers for importance of freeing (or leaving)
"str" and "buffer".

> postgres=# select pg_size_bytes('+912+ kB');
> ERROR:  invalid unit: "+ kB"
> This is difficult - you have to divide string to two parts and first world is 
> number, second world is unit.
Your code looks like a duplicated (not by lines, but by behavior).
numeric_in has similar checks, let it do them for you. The goal of
your function is just split mantissa and size unit, and if the last
one is found, turn it into an exponent.
See my example in the attachment check_mantissa_by_numeric_in.c. There
is just searching non-space char that can't be part of numeric. Then
all before it passes to numeric_in and let it check anything it
should. I even left first spaces to show full numeric part to a user
if an error occurs (for any case).
The second attachment single_buffer.c is an attempt to avoid
allocating the second buffer (the first is "str" allocated by
text_to_cstring) and copying into it. I don't think it gives a big
improvement, but nevertheless.

===
[1] http://www.postgresql.org/message-id/29618.1451882...@sss.pgh.pa.us
[2] 
http://www.postgresql.org/message-id/CAB7nPqS6Wob4WnZb=dhb3o0pc-nx1v3xjszkn3z9kbexgcq...@mail.gmail.com

-- 
Best regards,
Vitaly Burovoy
/*
 * Convert human readable size to long int.
 *
 * Due to support for decimal values and case insensitivity of units
 * a function parse_int cannot be used.
 */
Datum
pg_size_bytes(PG_FUNCTION_ARGS)
{
	text	   *arg = PG_GETARG_TEXT_PP(0);
	char	   *str,
			   *strptr;
	char	   *buffer,
			   *bufptr;
	Numeric		num;
	int64		result;
	size_t		len;

	str = text_to_cstring(arg);
	strptr = str;

	/* Skip leading spaces */
	while (isspace((unsigned char) *strptr))
		strptr++;

	bufptr = strptr;
	for(;*strptr != '\0'; strptr++)
	{
		if (isdigit((unsigned char) *strptr) ||
			isspace((unsigned char) *strptr) ||
			*strptr == '.' || *strptr == '+' || *strptr == '-')
			continue;

		break;
	}

	/* don't allow empty string */
	if (bufptr == strptr)
		

Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-20 Thread Pavel Stehule
2016-01-19 4:45 GMT+01:00 Vitaly Burovoy :

> On 1/4/16, Pavel Stehule  wrote:
> > 2016-01-04 18:13 GMT+01:00 Shulgin, Oleksandr <
> oleksandr.shul...@zalando.de> :
> >> On Mon, Jan 4, 2016 at 6:03 PM, Pavel Stehule 
> wrote:
> >> > 2016-01-04 17:48 GMT+01:00 Shulgin, Oleksandr <
> oleksandr.shul...@zalando.de>:
> >> >> On Mon, Jan 4, 2016 at 4:51 PM, Robert Haas 
> wrote:
> >> >>
> >> >> I'm also inclined on dropping that explicit check for empty string
> >> >> below and let numeric_in() error out on that.  Does this look OK, or
> can
> >> >> it confuse someone:
> >> >> postgres=# select pg_size_bytes('');
> >> >> ERROR:  invalid input syntax for type numeric: ""
> >> >
> >> > both fixed
> >>
> >> Hm...
> >>
> >> > + switch (*strptr)
> >> > + {
> >> > + /* ignore plus symbol */
> >> > + case '+':
> >> > + case '-':
> >> > + *bufptr++ = *strptr++;
> >> > + break;
> >> > + }
> >>
> >> Well, to that amount you don't need any special checks, I'm just not
> sure
> >> if reported error message is not misleading if we let numeric_in()
> handle
> >> all the errors.  At least it can cope with the leading spaces, +/- and
> >> empty input quite well.
> >>
> >
> > I don't would to catch a exception from numeric_in - so I try to solve
> some
> > simple situations, where I can raise possible better error message.
>
> There are several cases where your behavior gives strange errors (see
> below).
>
> Next batch of notes:
>
> src/include/catalog/pg_proc.h:
> ---
> + DATA(insert OID = 3317 ( pg_size_bytes...
> now oid 3317 is used (by pg_stat_get_wal_receiver), 3318 is free
>

fixed


>
> ---
> + DESCR("convert a human readable text with size units to big int bytes");
> May be the best way is to copy the first sentence from the doc?
> ("convert a size in human-readable format with size units into bytes")
>

fixed


> 
> src/backend/utils/adt/dbsize.c:
> + text *arg = PG_GETARG_TEXT_PP(0);
> + char *str = text_to_cstring(arg);
> ...
> +   /* working buffer cannot be longer than original string */
> +   buffer = (char *) palloc(VARSIZE_ANY_EXHDR(arg) + 1);
> Is there any reason to get TEXT for only converting it to cstring and
> call VARSIZE_ANY_EXHDR instead of strlen?
>

performance - but these strings should be short, so I can use strlen

fixed


>
> ---
> +   text   *arg = PG_GETARG_TEXT_PP(0);
> +   char   *str = text_to_cstring(arg);
> +   char*strptr = str;
> +   char   *buffer;
> There are wrong offsets of variable names after their types (among all
> body of the "pg_size_bytes" function).
> See variable declarations in nearby functions (
>   >> "make the new code look like the existing code around it"
>   http://www.postgresql.org/docs/devel/static/source-format.html
> )
>
>
fixed


> ---
> +errmsg("\"%s\" is not number",
> str)));
> s/is not number/is not a number/
> (the second version can be found in a couple places besides translations)
>

fixed

but this message can be little bit no intuitive - better text is "is not a
valid number"


>
> ---
> +   if (*strptr != '\0')
> ...
> +   while (*strptr && !isspace(*strptr))
> Sometimes it explicitly compares to '\0', sometimes implicitly.
> Common use is explicit comparison and it is preferred due to different
> compilers (their conversions to boolean).
>

fixed

>
> ---
> +   /* Skip leading spaces */
> ...
> +   /* ignore plus symbol */
> ...
> +   /* copy digits to working buffer */
> ...
> +   /* allow whitespace between integer and unit */
> I'm also inclined on dropping that explicit skipping spaces, checking
> for +/- symbols, but copying all digits, spaces, dots and '+-' symbols
> and let numeric_in() error out on that.
>

This is difficult - you have to divide string to two parts and first world
is number, second world is unit.

For example "+912+ kB" is correct number +912 and broken unit "+ kB".


> It allows to get correct error messages for something like:
> postgres=# select pg_size_bytes('.+912');
> ERROR:  invalid input syntax for type numeric: ".+912"
> postgres=# select pg_size_bytes('+912+ kB');
> ERROR:  invalid input syntax for type numeric: "+912+ "
> postgres=# select pg_size_bytes('++123 kB');
> ERROR:  invalid input syntax for type numeric: "++123 "
>
> instead of current:
> postgres=# select pg_size_bytes('.+912');
> ERROR:  invalid input syntax for type numeric: "."
> postgres=# select pg_size_bytes('+912+ kB');
> ERROR:  invalid unit: "+ kB"
> postgres=# select pg_size_bytes('++123 kB');
> ERROR:  invalid input syntax for type numeric: "+"
>
>
I redesigned this check. Now it is

popostgres=# select pg_size_bytes('.+912');
ERROR:  22023: ".+912" is not a valid number
stgres=# select pg_size_bytes('++123 kB');
ERROR:  22023: "++123 kB" is not a valid number



> ---
> +  

Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-20 Thread Pavel Stehule
Hi

2016-01-19 0:56 GMT+01:00 Vitaly Burovoy :

> On 1/4/16, Robert Haas  wrote:
> > On Mon, Jan 4, 2016 at 10:17 AM, Pavel Stehule 
> > wrote:
> >> [ new patch ]
> >
> > + case '-':
> > + ereport(ERROR,
> > + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > +  errmsg("size cannot be negative")));
> >
> > Why not?  I bet if you copy any - sign to the buffer, this will Just
> Work.
>
> Hmm. The function's name is pg_size_bytes. How number of bytes can be
> negative? How any length can be negative? If anyone insert '-' sign to
> an argument, it is copy-paste error. I don't see any case where there
> is possible negatives as input value.
>
> I prefer error message instead of getting all relations (by using
> comparison from the initial letter) just because of copy-paste mistake
> or incomplete checking of input values at app-level.
>

the last version of this patch support negative numbers.

Regards

Pavel


>
> > ...
> >
> > --
> > Robert Haas
> > EnterpriseDB: http://www.enterprisedb.com
> > The Enterprise PostgreSQL Company
>
> --
> Best regards,
> Vitaly Burovoy
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-19 Thread Robert Haas
On Mon, Jan 18, 2016 at 6:56 PM, Vitaly Burovoy
 wrote:
> On 1/4/16, Robert Haas  wrote:
>> On Mon, Jan 4, 2016 at 10:17 AM, Pavel Stehule 
>> wrote:
>>> [ new patch ]
>>
>> + case '-':
>> + ereport(ERROR,
>> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>> +  errmsg("size cannot be negative")));
>>
>> Why not?  I bet if you copy any - sign to the buffer, this will Just Work.
>
> Hmm. The function's name is pg_size_bytes. How number of bytes can be
> negative? How any length can be negative? If anyone insert '-' sign to
> an argument, it is copy-paste error. I don't see any case where there
> is possible negatives as input value.
>
> I prefer error message instead of getting all relations (by using
> comparison from the initial letter) just because of copy-paste mistake
> or incomplete checking of input values at app-level.

Well, we just recently did this:

commit 8a1fab36aba7506fcf4559c4ef95fcacdd0b439a
Author: Robert Haas 
Date:   Fri Nov 6 11:03:02 2015 -0500

pg_size_pretty: Format negative values similar to positive ones.

Previously, negative values were always displayed in bytes, regardless
of how large they were.

Adrian Vondendriesch, reviewed by Julien Rouhaud and myself

Since we went to the trouble of making the handling of positive and
negative values symmetric for that function, it seems it should be
done here also.  Otherwise, it is asymmetric.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-19 Thread Pavel Stehule
2016-01-19 13:42 GMT+01:00 Robert Haas :

> On Mon, Jan 18, 2016 at 6:56 PM, Vitaly Burovoy
>  wrote:
> > On 1/4/16, Robert Haas  wrote:
> >> On Mon, Jan 4, 2016 at 10:17 AM, Pavel Stehule  >
> >> wrote:
> >>> [ new patch ]
> >>
> >> + case '-':
> >> + ereport(ERROR,
> >> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> >> +  errmsg("size cannot be negative")));
> >>
> >> Why not?  I bet if you copy any - sign to the buffer, this will Just
> Work.
> >
> > Hmm. The function's name is pg_size_bytes. How number of bytes can be
> > negative? How any length can be negative? If anyone insert '-' sign to
> > an argument, it is copy-paste error. I don't see any case where there
> > is possible negatives as input value.
> >
> > I prefer error message instead of getting all relations (by using
> > comparison from the initial letter) just because of copy-paste mistake
> > or incomplete checking of input values at app-level.
>
> Well, we just recently did this:
>
> commit 8a1fab36aba7506fcf4559c4ef95fcacdd0b439a
> Author: Robert Haas 
> Date:   Fri Nov 6 11:03:02 2015 -0500
>
> pg_size_pretty: Format negative values similar to positive ones.
>
> Previously, negative values were always displayed in bytes, regardless
> of how large they were.
>
> Adrian Vondendriesch, reviewed by Julien Rouhaud and myself
>
> Since we went to the trouble of making the handling of positive and
> negative values symmetric for that function, it seems it should be
> done here also.  Otherwise, it is asymmetric.
>

the last patch (pg-size-bytes-08.patch
)
at 2016-01-04 17:03:02

allows negative size

Regards

Pavel



> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-18 Thread Vitaly Burovoy
On 1/4/16, Robert Haas  wrote:
> On Mon, Jan 4, 2016 at 10:17 AM, Pavel Stehule 
> wrote:
>> [ new patch ]
>
> + case '-':
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +  errmsg("size cannot be negative")));
>
> Why not?  I bet if you copy any - sign to the buffer, this will Just Work.

Hmm. The function's name is pg_size_bytes. How number of bytes can be
negative? How any length can be negative? If anyone insert '-' sign to
an argument, it is copy-paste error. I don't see any case where there
is possible negatives as input value.

I prefer error message instead of getting all relations (by using
comparison from the initial letter) just because of copy-paste mistake
or incomplete checking of input values at app-level.

> ...
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-18 Thread Vitaly Burovoy
On 1/4/16, Pavel Stehule  wrote:
> 2016-01-04 18:13 GMT+01:00 Shulgin, Oleksandr  :
>> On Mon, Jan 4, 2016 at 6:03 PM, Pavel Stehule  
>> wrote:
>> > 2016-01-04 17:48 GMT+01:00 Shulgin, Oleksandr 
>> > :
>> >> On Mon, Jan 4, 2016 at 4:51 PM, Robert Haas  wrote:
>> >>
>> >> I'm also inclined on dropping that explicit check for empty string
>> >> below and let numeric_in() error out on that.  Does this look OK, or can
>> >> it confuse someone:
>> >> postgres=# select pg_size_bytes('');
>> >> ERROR:  invalid input syntax for type numeric: ""
>> >
>> > both fixed
>>
>> Hm...
>>
>> > + switch (*strptr)
>> > + {
>> > + /* ignore plus symbol */
>> > + case '+':
>> > + case '-':
>> > + *bufptr++ = *strptr++;
>> > + break;
>> > + }
>>
>> Well, to that amount you don't need any special checks, I'm just not sure
>> if reported error message is not misleading if we let numeric_in() handle
>> all the errors.  At least it can cope with the leading spaces, +/- and
>> empty input quite well.
>>
>
> I don't would to catch a exception from numeric_in - so I try to solve some
> simple situations, where I can raise possible better error message.

There are several cases where your behavior gives strange errors (see below).

Next batch of notes:

src/include/catalog/pg_proc.h:
---
+ DATA(insert OID = 3317 ( pg_size_bytes...
now oid 3317 is used (by pg_stat_get_wal_receiver), 3318 is free

---
+ DESCR("convert a human readable text with size units to big int bytes");
May be the best way is to copy the first sentence from the doc?
("convert a size in human-readable format with size units into bytes")


src/backend/utils/adt/dbsize.c:
+ text *arg = PG_GETARG_TEXT_PP(0);
+ char *str = text_to_cstring(arg);
...
+   /* working buffer cannot be longer than original string */
+   buffer = (char *) palloc(VARSIZE_ANY_EXHDR(arg) + 1);
Is there any reason to get TEXT for only converting it to cstring and
call VARSIZE_ANY_EXHDR instead of strlen?

---
+   text   *arg = PG_GETARG_TEXT_PP(0);
+   char   *str = text_to_cstring(arg);
+   char*strptr = str;
+   char   *buffer;
There are wrong offsets of variable names after their types (among all
body of the "pg_size_bytes" function).
See variable declarations in nearby functions (
  >> "make the new code look like the existing code around it"
  http://www.postgresql.org/docs/devel/static/source-format.html
)

---
+errmsg("\"%s\" is not number", str)));
s/is not number/is not a number/
(the second version can be found in a couple places besides translations)

---
+   if (*strptr != '\0')
...
+   while (*strptr && !isspace(*strptr))
Sometimes it explicitly compares to '\0', sometimes implicitly.
Common use is explicit comparison and it is preferred due to different
compilers (their conversions to boolean).

---
+   /* Skip leading spaces */
...
+   /* ignore plus symbol */
...
+   /* copy digits to working buffer */
...
+   /* allow whitespace between integer and unit */
I'm also inclined on dropping that explicit skipping spaces, checking
for +/- symbols, but copying all digits, spaces, dots and '+-' symbols
and let numeric_in() error out on that.

It allows to get correct error messages for something like:
postgres=# select pg_size_bytes('.+912');
ERROR:  invalid input syntax for type numeric: ".+912"
postgres=# select pg_size_bytes('+912+ kB');
ERROR:  invalid input syntax for type numeric: "+912+ "
postgres=# select pg_size_bytes('++123 kB');
ERROR:  invalid input syntax for type numeric: "++123 "

instead of current:
postgres=# select pg_size_bytes('.+912');
ERROR:  invalid input syntax for type numeric: "."
postgres=# select pg_size_bytes('+912+ kB');
ERROR:  invalid unit: "+ kB"
postgres=# select pg_size_bytes('++123 kB');
ERROR:  invalid input syntax for type numeric: "+"

---
+   while (isspace((unsigned char) *strptr))
...
+   while (isspace(*strptr))
...
+   while (*strptr && !isspace(*strptr))
...
+   while (isspace(*strptr))
The first occurece of isspace's parameter is casting to "unsigned
char" whereas the others are not.
Note:
"The behavior is undefined if the value of ch is not representable as
unsigned char and is not equal to EOF"
Proof:
http://en.cppreference.com/w/c/string/byte/isspace

---
+   pfree(buffer);
+   pfree(str);
pfree-s here are not necessary. See:
http://www.neilconway.org/talks/hacking/hack_slides.pdf (page 17)

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-18 Thread Vitaly Burovoy
On 1/18/16, Vitaly Burovoy  wrote:
> <>
> ---
> + if (*strptr != '\0')
> ...
> + while (*strptr && !isspace(*strptr))
> Sometimes it explicitly compares to '\0', sometimes implicitly.
> Common use is explicit comparison and it is preferred due to different
> compilers (their conversions to boolean).
>
> ---
> <>

It seems I distracted on something else... That lines are ok, skip that block.

-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-04 Thread Alvaro Herrera
Robert Haas wrote:

> But you could also write SELECT relname FROM pg_class WHERE
> pg_relation_size(oid) > 100 * 1024^3, which is actually fewer
> characters.  Maybe pg_size_bytes('100 GB') is easier for some people
> to remember than 100 * 1024^3, but I'm probably not one of those
> people.

Nah, that might work for geek types, but I doubt it's the preferred
spelling for most people.  I think the proposal is quite reasonable.

If we were only catering for people who can do 2^10 arithmetic off the
top of their heads, we wouldn't have pg_size_pretty at all, would we?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-04 Thread Pavel Stehule
2016-01-04 21:29 GMT+01:00 Robert Haas :

> On Mon, Jan 4, 2016 at 12:17 PM, Shulgin, Oleksandr
>  wrote:
> >> I'm also kind of wondering what the intended use case for this
> >> function is.  Why do we want it?  Do we want it?
> >
> > As suggested above a usecase could be like the following:
> >
> > SELECT relname FROM pg_class WHERE pg_relation_size(oid) >
> > pg_size_bytes('100 GB');
> >
> > I think it's neat and useful.
>
> But you could also write SELECT relname FROM pg_class WHERE
> pg_relation_size(oid) > 100 * 1024^3, which is actually fewer
> characters.  Maybe pg_size_bytes('100 GB') is easier for some people
> to remember than 100 * 1024^3, but I'm probably not one of those
> people.
>

I am really one who hasn't memory for numbers - "100GB" is much more
verbose for me. It is more clean in more complex maintenance queries. And
if you need short syntax, you can write own SQL function

CREATE OR REPLACE FUNCTION size(text) RETURNS bigint AS $$ SELECT
pg_size_bytes($1) $$ LANGUAGE sql;

then ... pg_relation_size(oid) > size('100GB')

Regards

Pavel


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-04 Thread Robert Haas
On Mon, Jan 4, 2016 at 3:48 PM, Alvaro Herrera  wrote:
> Robert Haas wrote:
>
>> But you could also write SELECT relname FROM pg_class WHERE
>> pg_relation_size(oid) > 100 * 1024^3, which is actually fewer
>> characters.  Maybe pg_size_bytes('100 GB') is easier for some people
>> to remember than 100 * 1024^3, but I'm probably not one of those
>> people.
>
> Nah, that might work for geek types, but I doubt it's the preferred
> spelling for most people.  I think the proposal is quite reasonable.
>
> If we were only catering for people who can do 2^10 arithmetic off the
> top of their heads, we wouldn't have pg_size_pretty at all, would we?

Well, I don't know what 100 * 1024^3 is off the top of my head, but I
know that I can compute it by typing exactly that.  So I think one
direction is easier than the other.  However, IJWH.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-04 Thread Robert Haas
On Mon, Jan 4, 2016 at 12:17 PM, Shulgin, Oleksandr
 wrote:
>> I'm also kind of wondering what the intended use case for this
>> function is.  Why do we want it?  Do we want it?
>
> As suggested above a usecase could be like the following:
>
> SELECT relname FROM pg_class WHERE pg_relation_size(oid) >
> pg_size_bytes('100 GB');
>
> I think it's neat and useful.

But you could also write SELECT relname FROM pg_class WHERE
pg_relation_size(oid) > 100 * 1024^3, which is actually fewer
characters.  Maybe pg_size_bytes('100 GB') is easier for some people
to remember than 100 * 1024^3, but I'm probably not one of those
people.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-04 Thread Shulgin, Oleksandr
On Wed, Dec 30, 2015 at 8:28 PM, Pavel Stehule 
wrote:

>
>
> 2015-12-30 17:33 GMT+01:00 Robert Haas :
>
>> On Mon, Dec 28, 2015 at 8:45 AM, Shulgin, Oleksandr
>>  wrote:
>> > I didn't check out earlier versions of this patch, but the latest one
>> still
>> > changes pg_size_pretty() to emit PB suffix.
>> >
>> > I don't think it is worth it to throw a number of changes together like
>> > that.  We should focus on adding pg_size_bytes() first and make it
>> > compatible with both pg_size_pretty() and existing GUC units: that is
>> > support suffixes up to TB and make sure they have the meaning of powers
>> of
>> > 2^10, not 10^3.  Re-using the table present in guc.c would be a plus.
>> >
>> > Next, we could think about adding handling of PB suffix on input and
>> output,
>> > but I don't see a big problem if that is emitted as 1024TB or the user
>> has
>> > to specify it as 1024TB in a GUC or argument to pg_size_bytes(): an
>> minor
>> > inconvenience only.
>>
>> +1 to everything in this email.
>>
>
> so I removed support for PB and SI units. Now the
> memory_unit_conversion_table is shared.
>

Looks better, thanks.

I'm not sure why the need to touch the regression test for pg_size_pretty():

!10.5 | 10.5 bytes | -10.5 bytes
!  1000.5 | 1000.5 bytes   | -1000.5 bytes
!   100.5 | 977 kB | -977 kB
!10.5 | 954 MB | -954 MB
! 1.5 | 931 GB | -931 GB
!  1000.5 | 909 TB | -909 TB

A nitpick, this loop:

+ while (*cp)
+ {
+ if ((isdigit(*cp) || *cp == '.') && ndigits < MAX_DIGITS)
+ digits[ndigits++] = *cp++;
+ else
+ break;
+ }

would be a bit easier to parse if spelled as:

+ while (*cp && (isdigit(*cp) || *cp == '.') && ndigits < MAX_DIGITS)
+ digits[ndigits++] = *cp++;

On the other hand, this seems to truncate the digits silently:

+ digits[ndigits] = '\0';

I don't think we want that, e.g:

postgres=# select pg_size_bytes('9223372036854775807.9');
ERROR:  invalid unit "9"
HINT:  Valid units for this parameter are "kB", "MB", "GB", and "TB".

I think making a mutable copy of the input string and truncating it before
passing to numeric_in() would make more sense--no need to hard-code
MAX_DIGITS.  The same goes for hard-coding MAX_UNIT_LEN, e.g. compare the
following two outputs:

postgres=# select pg_size_bytes('1 KiB');
ERROR:  invalid unit "KiB"
HINT:  Valid units for this parameter are "kB", "MB", "GB", and "TB".

postgres=# select pg_size_bytes('1024 bytes');
ERROR:  invalid format

I believe we should see a similar error message and a hint in the latter
case.  (No, I don't think we should add support for 'bytes' as a unit, not
even for "compatibility" with pg_size_pretty()--for one, I don't think it
would be wise to expect pg_size_bytes() to be able to deparse *every*
possible output produced by pg_size_pretty() as it's purpose is
human-readable display; also, pg_size_pretty() can easily produce output
that doesn't fit into bigint type, or is just negative)

Code comments and doc change need proof-reading by a native English
speaker, which I am not.

--
Alex


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-04 Thread Pavel Stehule
Hi

2016-01-04 12:46 GMT+01:00 Shulgin, Oleksandr 
:

> On Wed, Dec 30, 2015 at 8:28 PM, Pavel Stehule 
> wrote:
>
>>
>>
>> 2015-12-30 17:33 GMT+01:00 Robert Haas :
>>
>>> On Mon, Dec 28, 2015 at 8:45 AM, Shulgin, Oleksandr
>>>  wrote:
>>> > I didn't check out earlier versions of this patch, but the latest one
>>> still
>>> > changes pg_size_pretty() to emit PB suffix.
>>> >
>>> > I don't think it is worth it to throw a number of changes together like
>>> > that.  We should focus on adding pg_size_bytes() first and make it
>>> > compatible with both pg_size_pretty() and existing GUC units: that is
>>> > support suffixes up to TB and make sure they have the meaning of
>>> powers of
>>> > 2^10, not 10^3.  Re-using the table present in guc.c would be a plus.
>>> >
>>> > Next, we could think about adding handling of PB suffix on input and
>>> output,
>>> > but I don't see a big problem if that is emitted as 1024TB or the user
>>> has
>>> > to specify it as 1024TB in a GUC or argument to pg_size_bytes(): an
>>> minor
>>> > inconvenience only.
>>>
>>> +1 to everything in this email.
>>>
>>
>> so I removed support for PB and SI units. Now the
>> memory_unit_conversion_table is shared.
>>
>
> Looks better, thanks.
>
> I'm not sure why the need to touch the regression test for
> pg_size_pretty():
>
> !10.5 | 10.5 bytes | -10.5 bytes
> !  1000.5 | 1000.5 bytes   | -1000.5 bytes
> !   100.5 | 977 kB | -977 kB
> !10.5 | 954 MB | -954 MB
> ! 1.5 | 931 GB | -931 GB
> !  1000.5 | 909 TB | -909 TB
>
>
fixed


> A nitpick, this loop:
>
> + while (*cp)
> + {
> + if ((isdigit(*cp) || *cp == '.') && ndigits < MAX_DIGITS)
> + digits[ndigits++] = *cp++;
> + else
> + break;
> + }
>
> would be a bit easier to parse if spelled as:
>
> + while (*cp && (isdigit(*cp) || *cp == '.') && ndigits < MAX_DIGITS)
> + digits[ndigits++] = *cp++;
>

fixed


>
> On the other hand, this seems to truncate the digits silently:
>
> + digits[ndigits] = '\0';
>
> I don't think we want that, e.g:
>
> postgres=# select pg_size_bytes('9223372036854775807.9');
> ERROR:  invalid unit "9"
> HINT:  Valid units for this parameter are "kB", "MB", "GB", and "TB".
>
> I think making a mutable copy of the input string and truncating it before
> passing to numeric_in() would make more sense--no need to hard-code
> MAX_DIGITS.  The same goes for hard-coding MAX_UNIT_LEN, e.g. compare the
> following two outputs:
>
> postgres=# select pg_size_bytes('1 KiB');
> ERROR:  invalid unit "KiB"
> HINT:  Valid units for this parameter are "kB", "MB", "GB", and "TB".
>
> postgres=# select pg_size_bytes('1024 bytes');
> ERROR:  invalid format
>
>
fixed


> I believe we should see a similar error message and a hint in the latter
> case.  (No, I don't think we should add support for 'bytes' as a unit, not
> even for "compatibility" with pg_size_pretty()--for one, I don't think it
> would be wise to expect pg_size_bytes() to be able to deparse *every*
> possible output produced by pg_size_pretty() as it's purpose is
> human-readable display; also, pg_size_pretty() can easily produce output
> that doesn't fit into bigint type, or is just negative)
>
> Code comments and doc change need proof-reading by a native English
> speaker, which I am not.
>


Regards

Pavel


>
> --
> Alex
>
>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 8ef9fce..6b921ae
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17607,17612 
--- 17607,17615 
  pg_relation_size
 
 
+ pg_size_bytes
+
+
  pg_size_pretty
 
 
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17677,17682 
--- 17680,17696 
 


+
+ pg_size_bytes(text)
+ 
+bigint
+
+  Converts a size in human-readable format with size units
+  into bytes.  The parameter is case insensitive string. Following
+  units are supported: kB, MB, GB, TB.
+
+   
+   
 
  pg_size_pretty(bigint)
  
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
new file mode 100644
index 2084692..ce97467
*** a/src/backend/utils/adt/dbsize.c
--- b/src/backend/utils/adt/dbsize.c
***
*** 25,30 
--- 25,31 
  #include "storage/fd.h"
  #include "utils/acl.h"
  #include "utils/builtins.h"
+ #include "utils/guc.h"
  #include "utils/numeric.h"
  #include "utils/rel.h"
  #include "utils/relfilenodemap.h"
*** pg_size_pretty_numeric(PG_FUNCTION_ARGS)
*** 700,705 
--- 701,811 
  }
  
  /*
+  * Convert human readable size to long int.
+  *
+  * Due suppor decimal value and case insensitivity of units
+  * a 

Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-04 Thread Robert Haas
On Mon, Jan 4, 2016 at 10:17 AM, Pavel Stehule  wrote:
> [ new patch ]

+ case '-':
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+  errmsg("size cannot be negative")));

Why not?  I bet if you copy any - sign to the buffer, this will Just Work.

+ if ( conv->base_unit == GUC_UNIT_KB &&

Whitespace.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-04 Thread Pavel Stehule
2016-01-04 16:51 GMT+01:00 Robert Haas :

> On Mon, Jan 4, 2016 at 10:17 AM, Pavel Stehule 
> wrote:
> > [ new patch ]
>
> + case '-':
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +  errmsg("size cannot be negative")));
>
> Why not?  I bet if you copy any - sign to the buffer, this will Just Work.
>

true, fixed


>
> + if ( conv->base_unit == GUC_UNIT_KB &&
>
> Whitespace.
>

I don't see it ??

Regards

Pavel


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 8ef9fce..6b921ae
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17607,17612 
--- 17607,17615 
  pg_relation_size
 
 
+ pg_size_bytes
+
+
  pg_size_pretty
 
 
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17677,17682 
--- 17680,17696 
 


+
+ pg_size_bytes(text)
+ 
+bigint
+
+  Converts a size in human-readable format with size units
+  into bytes.  The parameter is case insensitive string. Following
+  units are supported: kB, MB, GB, TB.
+
+   
+   
 
  pg_size_pretty(bigint)
  
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
new file mode 100644
index 2084692..0b7af3c
*** a/src/backend/utils/adt/dbsize.c
--- b/src/backend/utils/adt/dbsize.c
***
*** 25,30 
--- 25,31 
  #include "storage/fd.h"
  #include "utils/acl.h"
  #include "utils/builtins.h"
+ #include "utils/guc.h"
  #include "utils/numeric.h"
  #include "utils/rel.h"
  #include "utils/relfilenodemap.h"
*** pg_size_pretty_numeric(PG_FUNCTION_ARGS)
*** 700,705 
--- 701,808 
  }
  
  /*
+  * Convert human readable size to long int.
+  *
+  * Due suppor decimal value and case insensitivity of units
+  * a function parse_intcannot be used.
+  */
+ Datum
+ pg_size_bytes(PG_FUNCTION_ARGS)
+ {
+ 	text		   *arg = PG_GETARG_TEXT_PP(0);
+ 	char		   *str = text_to_cstring(arg);
+ 	char	*strptr = str;
+ 	char		   *buffer;
+ 	char	*bufptr;
+ 	Numeric			num;
+ 	int64 result;
+ 
+ 	/* working buffer cannot be longer than original string */
+ 	buffer = (char *) palloc(VARSIZE_ANY_EXHDR(arg) + 1);
+ 	bufptr = buffer;
+ 
+ 	/* Skip leading spaces */
+ 	while (isspace((unsigned char) *strptr))
+ 		strptr++;
+ 
+ 	switch (*strptr)
+ 	{
+ 		/* ignore plus symbol */
+ 		case '+':
+ 		case '-':
+ 			*bufptr++ = *strptr++;
+ 			break;
+ 	}
+ 
+ 	/* copy digits to working buffer */
+ 	while (*strptr && (isdigit(*strptr) || *strptr == '.'))
+ 		*bufptr++ = *strptr++;
+ 	*bufptr = '\0';
+ 
+ 	/* don't allow empty string */
+ 	if (*buffer == '\0')
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("\"%s\" is not number", str)));
+ 
+ 	num = DatumGetNumeric(DirectFunctionCall3(numeric_in,
+ 			CStringGetDatum(buffer), 0, -1));
+ 
+ 	/* allow whitespace between integer and unit */
+ 	while (isspace(*strptr))
+ 		strptr++;
+ 
+ 	/* Handle possible unit */
+ 	if (*strptr != '\0')
+ 	{
+ 		int		multiplier;
+ 		Numeric			mul_num;
+ 		const char	*hintmsg;
+ 		const char *unitstr = strptr;
+ 
+ 		bufptr = buffer;
+ 
+ 		/* copy chars to buffer and stop on space */
+ 		while (*strptr && !isspace(*strptr))
+ 			*bufptr++ = *strptr++;
+ 		*bufptr = '\0';
+ 
+ 		/*
+ 		 * Use buffer as unit if there are not any nonspace char,
+ 		 * else use a original unit string.
+ 		 */
+ 		while (isspace(*strptr))
+ 			strptr++;
+ 		if (*strptr == '\0')
+ 			unitstr = buffer;
+ 
+ 		if (!parse_memory_unit(unitstr, , ))
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("invalid unit: \"%s\"", unitstr),
+ 	 errhint("%s", _(hintmsg;
+ 
+ 		/*
+ 		 * Now, the multiplier is in KB unit. It should be multiplied by 1024
+ 		 * before usage
+ 		 */
+ 		mul_num = DatumGetNumeric(DirectFunctionCall1(int8_numeric,
+ 		Int64GetDatum(multiplier * 1024L)));
+ 
+ 		num = DatumGetNumeric(DirectFunctionCall2(numeric_mul,
+ 			NumericGetDatum(mul_num),
+ 			NumericGetDatum(num)));
+ 	}
+ 
+ 	result = DatumGetInt64(DirectFunctionCall1(numeric_int8, NumericGetDatum(num)));
+ 
+ 	pfree(buffer);
+ 	pfree(str);
+ 
+ 	PG_RETURN_INT64(result);
+ }
+ 
+ /*
   * Get the filenode of a relation
   *
   * This is expected to be used in queries like
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
new file mode 100644
index 38ba82f..00021fd
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
*** convert_from_base_unit(int64 base_value,
*** 5238,5243 
--- 5238,5272 
  
  
  /*
+  * Parse value as some known 

Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-04 Thread Shulgin, Oleksandr
On Mon, Jan 4, 2016 at 6:03 PM, Pavel Stehule 
wrote:
>
> 2016-01-04 17:48 GMT+01:00 Shulgin, Oleksandr <
oleksandr.shul...@zalando.de>:
>>
>> On Mon, Jan 4, 2016 at 4:51 PM, Robert Haas 
wrote:
>>>
>>> On Mon, Jan 4, 2016 at 10:17 AM, Pavel Stehule 
wrote:
>>> > [ new patch ]
>>>
>>> + case '-':
>>> + ereport(ERROR,
>>> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>>> +  errmsg("size cannot be negative")));
>>>
>>> Why not?  I bet if you copy any - sign to the buffer, this will Just
Work.
>>
>>
>> I'm also inclined on dropping that explicit check for empty string below
and let numeric_in() error out on that.  Does this look OK, or can it
confuse someone:
>>
>> postgres=# select pg_size_bytes('');
>> ERROR:  invalid input syntax for type numeric: ""
>>
>> ?
>>
>>> + if ( conv->base_unit == GUC_UNIT_KB &&
>>
>>
>> Between "(" and "conv->..." I believe.
>
>
> both fixed

Hm...

> + switch (*strptr)
> + {
> + /* ignore plus symbol */
> + case '+':
> + case '-':
> + *bufptr++ = *strptr++;
> + break;
> + }

Well, to that amount you don't need any special checks, I'm just not sure
if reported error message is not misleading if we let numeric_in() handle
all the errors.  At least it can cope with the leading spaces, +/- and
empty input quite well.

--
Alex


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-04 Thread Robert Haas
On Mon, Jan 4, 2016 at 11:48 AM, Shulgin, Oleksandr
 wrote:
> On Mon, Jan 4, 2016 at 4:51 PM, Robert Haas  wrote:
>>
>> On Mon, Jan 4, 2016 at 10:17 AM, Pavel Stehule 
>> wrote:
>> > [ new patch ]
>>
>> + case '-':
>> + ereport(ERROR,
>> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>> +  errmsg("size cannot be negative")));
>>
>> Why not?  I bet if you copy any - sign to the buffer, this will Just Work.
>
>
> I'm also inclined on dropping that explicit check for empty string below and
> let numeric_in() error out on that.  Does this look OK, or can it confuse
> someone:
>
> postgres=# select pg_size_bytes('');
> ERROR:  invalid input syntax for type numeric: ""

I think that's a pretty bad error message.  I mean, the user is
calling a function that takes text as an input data type.  So, where's
numeric involved?

I'm also kind of wondering what the intended use case for this
function is.  Why do we want it?  Do we want it?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-04 Thread Shulgin, Oleksandr
On Mon, Jan 4, 2016 at 6:14 PM, Robert Haas  wrote:

> On Mon, Jan 4, 2016 at 11:48 AM, Shulgin, Oleksandr
> >
> > postgres=# select pg_size_bytes('');
> > ERROR:  invalid input syntax for type numeric: ""
>
> I think that's a pretty bad error message.  I mean, the user is
> calling a function that takes text as an input data type.  So, where's
> numeric involved?
>

Is there a way to force CONTEXT output in the reported error?  I guess that
could help.

I'm also kind of wondering what the intended use case for this
> function is.  Why do we want it?  Do we want it?
>

As suggested above a usecase could be like the following:

SELECT relname FROM pg_class WHERE pg_relation_size(oid) >
pg_size_bytes('100 GB');

I think it's neat and useful.

--
Alex


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-04 Thread Pavel Stehule
2016-01-04 18:14 GMT+01:00 Robert Haas :

> On Mon, Jan 4, 2016 at 11:48 AM, Shulgin, Oleksandr
>  wrote:
> > On Mon, Jan 4, 2016 at 4:51 PM, Robert Haas 
> wrote:
> >>
> >> On Mon, Jan 4, 2016 at 10:17 AM, Pavel Stehule  >
> >> wrote:
> >> > [ new patch ]
> >>
> >> + case '-':
> >> + ereport(ERROR,
> >> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> >> +  errmsg("size cannot be negative")));
> >>
> >> Why not?  I bet if you copy any - sign to the buffer, this will Just
> Work.
> >
> >
> > I'm also inclined on dropping that explicit check for empty string below
> and
> > let numeric_in() error out on that.  Does this look OK, or can it confuse
> > someone:
> >
> > postgres=# select pg_size_bytes('');
> > ERROR:  invalid input syntax for type numeric: ""
>
> I think that's a pretty bad error message.  I mean, the user is
> calling a function that takes text as an input data type.  So, where's
> numeric involved?
>

last version is little bit better

postgres=# select pg_size_bytes('');
ERROR:  22023: "" is not number



>
> I'm also kind of wondering what the intended use case for this
> function is.  Why do we want it?  Do we want it?
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-04 Thread Pavel Stehule
2016-01-04 18:13 GMT+01:00 Shulgin, Oleksandr 
:

> On Mon, Jan 4, 2016 at 6:03 PM, Pavel Stehule 
> wrote:
> >
> > 2016-01-04 17:48 GMT+01:00 Shulgin, Oleksandr <
> oleksandr.shul...@zalando.de>:
> >>
> >> On Mon, Jan 4, 2016 at 4:51 PM, Robert Haas 
> wrote:
> >>>
> >>> On Mon, Jan 4, 2016 at 10:17 AM, Pavel Stehule <
> pavel.steh...@gmail.com> wrote:
> >>> > [ new patch ]
> >>>
> >>> + case '-':
> >>> + ereport(ERROR,
> >>> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> >>> +  errmsg("size cannot be negative")));
> >>>
> >>> Why not?  I bet if you copy any - sign to the buffer, this will Just
> Work.
> >>
> >>
> >> I'm also inclined on dropping that explicit check for empty string
> below and let numeric_in() error out on that.  Does this look OK, or can it
> confuse someone:
>
>>
> >> postgres=# select pg_size_bytes('');
> >> ERROR:  invalid input syntax for type numeric: ""
> >>
> >> ?
> >>
> >>> + if ( conv->base_unit == GUC_UNIT_KB &&
> >>
> >>
> >> Between "(" and "conv->..." I believe.
> >
> >
> > both fixed
>
> Hm...
>
> > + switch (*strptr)
> > + {
> > + /* ignore plus symbol */
> > + case '+':
> > + case '-':
> > + *bufptr++ = *strptr++;
> > + break;
> > + }
>
> Well, to that amount you don't need any special checks, I'm just not sure
> if reported error message is not misleading if we let numeric_in() handle
> all the errors.  At least it can cope with the leading spaces, +/- and
> empty input quite well.
>

I don't would to catch a exception from numeric_in - so I try to solve some
simple situations, where I can raise possible better error message.

Regards

Pavel


>
> --
> Alex
>
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-04 Thread Thom Brown
On 4 January 2016 at 15:17, Pavel Stehule  wrote:
> Hi
>
> 2016-01-04 12:46 GMT+01:00 Shulgin, Oleksandr
> :
>>
>> On Wed, Dec 30, 2015 at 8:28 PM, Pavel Stehule 
>> wrote:
>>>
>>>
>>>
>>> 2015-12-30 17:33 GMT+01:00 Robert Haas :

 On Mon, Dec 28, 2015 at 8:45 AM, Shulgin, Oleksandr
  wrote:
 > I didn't check out earlier versions of this patch, but the latest one
 > still
 > changes pg_size_pretty() to emit PB suffix.
 >
 > I don't think it is worth it to throw a number of changes together
 > like
 > that.  We should focus on adding pg_size_bytes() first and make it
 > compatible with both pg_size_pretty() and existing GUC units: that is
 > support suffixes up to TB and make sure they have the meaning of
 > powers of
 > 2^10, not 10^3.  Re-using the table present in guc.c would be a plus.
 >
 > Next, we could think about adding handling of PB suffix on input and
 > output,
 > but I don't see a big problem if that is emitted as 1024TB or the user
 > has
 > to specify it as 1024TB in a GUC or argument to pg_size_bytes(): an
 > minor
 > inconvenience only.

 +1 to everything in this email.
>>>
>>>
>>> so I removed support for PB and SI units. Now the
>>> memory_unit_conversion_table is shared.
>>
>>
>> Looks better, thanks.
>>
>> I'm not sure why the need to touch the regression test for
>> pg_size_pretty():
>>
>> !10.5 | 10.5 bytes | -10.5 bytes
>> !  1000.5 | 1000.5 bytes   | -1000.5 bytes
>> !   100.5 | 977 kB | -977 kB
>> !10.5 | 954 MB | -954 MB
>> ! 1.5 | 931 GB | -931 GB
>> !  1000.5 | 909 TB | -909 TB
>>
>
> fixed
>
>>
>> A nitpick, this loop:
>>
>> + while (*cp)
>> + {
>> + if ((isdigit(*cp) || *cp == '.') && ndigits < MAX_DIGITS)
>> + digits[ndigits++] = *cp++;
>> + else
>> + break;
>> + }
>>
>> would be a bit easier to parse if spelled as:
>>
>> + while (*cp && (isdigit(*cp) || *cp == '.') && ndigits < MAX_DIGITS)
>> + digits[ndigits++] = *cp++;
>
>
> fixed
>
>>
>>
>> On the other hand, this seems to truncate the digits silently:
>>
>> + digits[ndigits] = '\0';
>>
>> I don't think we want that, e.g:
>>
>> postgres=# select pg_size_bytes('9223372036854775807.9');
>> ERROR:  invalid unit "9"
>> HINT:  Valid units for this parameter are "kB", "MB", "GB", and "TB".
>>
>> I think making a mutable copy of the input string and truncating it before
>> passing to numeric_in() would make more sense--no need to hard-code
>> MAX_DIGITS.  The same goes for hard-coding MAX_UNIT_LEN, e.g. compare the
>> following two outputs:
>>
>> postgres=# select pg_size_bytes('1 KiB');
>> ERROR:  invalid unit "KiB"
>> HINT:  Valid units for this parameter are "kB", "MB", "GB", and "TB".
>>
>> postgres=# select pg_size_bytes('1024 bytes');
>> ERROR:  invalid format
>>
>
> fixed

Hi,

Some feedback:

+  Converts a size in human-readable format with size units
+  into bytes.  The parameter is case insensitive string. Following
+  units are supported: kB, MB, GB, TB.

May I suggest:

"Converts a size in human-readable format with size units into bytes.
The parameter is case-insensitive, and the supported size units are
are: kB, MB, GB and TB."

+  * Convert human readable size to long int.

Big int?

+  * Due suppor decimal value and case insensitivity of units
+  * a function parse_intcannot be used.

Is this supposed to be saying:

"Due to support for decimal values..."?

and "a function parse_int cannot be used."?

+  * Use buffer as unit if there are not any nonspace char,
+  * else use a original unit string.

s/use a/use an/

+  * Now, the multiplier is in KB unit. It should be multiplied by 1024
+  * before usage

s/unit/units/

Regards

Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-04 Thread Shulgin, Oleksandr
On Mon, Jan 4, 2016 at 4:51 PM, Robert Haas  wrote:

> On Mon, Jan 4, 2016 at 10:17 AM, Pavel Stehule 
> wrote:
> > [ new patch ]
>
> + case '-':
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +  errmsg("size cannot be negative")));
>
> Why not?  I bet if you copy any - sign to the buffer, this will Just Work.
>

I'm also inclined on dropping that explicit check for empty string below
and let numeric_in() error out on that.  Does this look OK, or can it
confuse someone:

postgres=# select pg_size_bytes('');
ERROR:  invalid input syntax for type numeric: ""

?

+ if ( conv->base_unit == GUC_UNIT_KB &&
>

Between "(" and "conv->..." I believe.

---
Alex


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-01-04 Thread Pavel Stehule
2016-01-04 17:48 GMT+01:00 Shulgin, Oleksandr 
:

> On Mon, Jan 4, 2016 at 4:51 PM, Robert Haas  wrote:
>
>> On Mon, Jan 4, 2016 at 10:17 AM, Pavel Stehule 
>> wrote:
>> > [ new patch ]
>>
>> + case '-':
>> + ereport(ERROR,
>> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>> +  errmsg("size cannot be negative")));
>>
>> Why not?  I bet if you copy any - sign to the buffer, this will Just Work.
>>
>
> I'm also inclined on dropping that explicit check for empty string below
> and let numeric_in() error out on that.  Does this look OK, or can it
> confuse someone:
>
> postgres=# select pg_size_bytes('');
> ERROR:  invalid input syntax for type numeric: ""
>
> ?
>
> + if ( conv->base_unit == GUC_UNIT_KB &&
>>
>
> Between "(" and "conv->..." I believe.
>

both fixed

Regards

Pavel


>
> ---
> Alex
>
>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 8ef9fce..df0a9f2
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17607,17612 
--- 17607,17615 
  pg_relation_size
 
 
+ pg_size_bytes
+
+
  pg_size_pretty
 
 
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17677,17682 
--- 17680,17696 
 


+
+ pg_size_bytes(text)
+ 
+bigint
+
+  Converts a size in human-readable format with size units into bytes.
+  The parameter is case-insensitive, and the supported size units
+  are: kB, MB, GB and TB.
+
+   
+   
 
  pg_size_pretty(bigint)
  
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
new file mode 100644
index 2084692..0680108
*** a/src/backend/utils/adt/dbsize.c
--- b/src/backend/utils/adt/dbsize.c
***
*** 25,30 
--- 25,31 
  #include "storage/fd.h"
  #include "utils/acl.h"
  #include "utils/builtins.h"
+ #include "utils/guc.h"
  #include "utils/numeric.h"
  #include "utils/rel.h"
  #include "utils/relfilenodemap.h"
*** pg_size_pretty_numeric(PG_FUNCTION_ARGS)
*** 700,705 
--- 701,808 
  }
  
  /*
+  * Convert human readable size to long int.
+  *
+  * Due to support for decimal values and case insensitivity of units
+  * a function parse_int cannot be used.
+  */
+ Datum
+ pg_size_bytes(PG_FUNCTION_ARGS)
+ {
+ 	text		   *arg = PG_GETARG_TEXT_PP(0);
+ 	char		   *str = text_to_cstring(arg);
+ 	char	*strptr = str;
+ 	char		   *buffer;
+ 	char	*bufptr;
+ 	Numeric			num;
+ 	int64 result;
+ 
+ 	/* working buffer cannot be longer than original string */
+ 	buffer = (char *) palloc(VARSIZE_ANY_EXHDR(arg) + 1);
+ 	bufptr = buffer;
+ 
+ 	/* Skip leading spaces */
+ 	while (isspace((unsigned char) *strptr))
+ 		strptr++;
+ 
+ 	switch (*strptr)
+ 	{
+ 		/* ignore plus symbol */
+ 		case '+':
+ 		case '-':
+ 			*bufptr++ = *strptr++;
+ 			break;
+ 	}
+ 
+ 	/* copy digits to working buffer */
+ 	while (*strptr && (isdigit(*strptr) || *strptr == '.'))
+ 		*bufptr++ = *strptr++;
+ 	*bufptr = '\0';
+ 
+ 	/* don't allow empty string */
+ 	if (*buffer == '\0')
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("\"%s\" is not number", str)));
+ 
+ 	num = DatumGetNumeric(DirectFunctionCall3(numeric_in,
+ 			CStringGetDatum(buffer), 0, -1));
+ 
+ 	/* allow whitespace between integer and unit */
+ 	while (isspace(*strptr))
+ 		strptr++;
+ 
+ 	/* Handle possible unit */
+ 	if (*strptr != '\0')
+ 	{
+ 		int		multiplier;
+ 		Numeric			mul_num;
+ 		const char	*hintmsg;
+ 		const char *unitstr = strptr;
+ 
+ 		bufptr = buffer;
+ 
+ 		/* copy chars to buffer and stop on space */
+ 		while (*strptr && !isspace(*strptr))
+ 			*bufptr++ = *strptr++;
+ 		*bufptr = '\0';
+ 
+ 		/*
+ 		 * Use a buffer as unit if there are not any nonspace char,
+ 		 * else use an original unit string.
+ 		 */
+ 		while (isspace(*strptr))
+ 			strptr++;
+ 		if (*strptr == '\0')
+ 			unitstr = buffer;
+ 
+ 		if (!parse_memory_unit(unitstr, , ))
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("invalid unit: \"%s\"", unitstr),
+ 	 errhint("%s", _(hintmsg;
+ 
+ 		/*
+ 		 * Now, the multiplier is in KB units. It should be multiplied by 1024
+ 		 * before usage
+ 		 */
+ 		mul_num = DatumGetNumeric(DirectFunctionCall1(int8_numeric,
+ 		Int64GetDatum(multiplier * 1024L)));
+ 
+ 		num = DatumGetNumeric(DirectFunctionCall2(numeric_mul,
+ 			NumericGetDatum(mul_num),
+ 			NumericGetDatum(num)));
+ 	}
+ 
+ 	result = DatumGetInt64(DirectFunctionCall1(numeric_int8, NumericGetDatum(num)));
+ 
+ 	pfree(buffer);
+ 	pfree(str);
+ 
+ 	PG_RETURN_INT64(result);
+ }
+ 
+ /*
   * Get the filenode of a relation
   *
   * This is expected to be used in queries like
diff --git 

Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-30 Thread Robert Haas
On Mon, Dec 28, 2015 at 8:45 AM, Shulgin, Oleksandr
 wrote:
> I didn't check out earlier versions of this patch, but the latest one still
> changes pg_size_pretty() to emit PB suffix.
>
> I don't think it is worth it to throw a number of changes together like
> that.  We should focus on adding pg_size_bytes() first and make it
> compatible with both pg_size_pretty() and existing GUC units: that is
> support suffixes up to TB and make sure they have the meaning of powers of
> 2^10, not 10^3.  Re-using the table present in guc.c would be a plus.
>
> Next, we could think about adding handling of PB suffix on input and output,
> but I don't see a big problem if that is emitted as 1024TB or the user has
> to specify it as 1024TB in a GUC or argument to pg_size_bytes(): an minor
> inconvenience only.

+1 to everything in this email.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-30 Thread Pavel Stehule
2015-12-30 17:33 GMT+01:00 Robert Haas :

> On Mon, Dec 28, 2015 at 8:45 AM, Shulgin, Oleksandr
>  wrote:
> > I didn't check out earlier versions of this patch, but the latest one
> still
> > changes pg_size_pretty() to emit PB suffix.
> >
> > I don't think it is worth it to throw a number of changes together like
> > that.  We should focus on adding pg_size_bytes() first and make it
> > compatible with both pg_size_pretty() and existing GUC units: that is
> > support suffixes up to TB and make sure they have the meaning of powers
> of
> > 2^10, not 10^3.  Re-using the table present in guc.c would be a plus.
> >
> > Next, we could think about adding handling of PB suffix on input and
> output,
> > but I don't see a big problem if that is emitted as 1024TB or the user
> has
> > to specify it as 1024TB in a GUC or argument to pg_size_bytes(): an minor
> > inconvenience only.
>
> +1 to everything in this email.
>

so I removed support for PB and SI units. Now the
memory_unit_conversion_table is shared.

Regards

Pavel


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 8ef9fce..6b921ae
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17607,17612 
--- 17607,17615 
  pg_relation_size
 
 
+ pg_size_bytes
+
+
  pg_size_pretty
 
 
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17677,17682 
--- 17680,17696 
 


+
+ pg_size_bytes(text)
+ 
+bigint
+
+  Converts a size in human-readable format with size units
+  into bytes.  The parameter is case insensitive string. Following
+  units are supported: kB, MB, GB, TB.
+
+   
+   
 
  pg_size_pretty(bigint)
  
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
new file mode 100644
index 5ee59d0..3ec6d0c
*** a/src/backend/utils/adt/dbsize.c
--- b/src/backend/utils/adt/dbsize.c
***
*** 25,30 
--- 25,31 
  #include "storage/fd.h"
  #include "utils/acl.h"
  #include "utils/builtins.h"
+ #include "utils/guc.h"
  #include "utils/numeric.h"
  #include "utils/rel.h"
  #include "utils/relfilenodemap.h"
*** pg_size_pretty_numeric(PG_FUNCTION_ARGS)
*** 699,704 
--- 700,807 
  	PG_RETURN_TEXT_P(cstring_to_text(result));
  }
  
+ #define MAX_DIGITS		20
+ #define MAX_UNIT_LEN		3
+ 
+ /*
+  * Convert human readable size to long int.
+  *
+  * Due suppor decimal value and case insensitivity of units
+  * a function parse_intcannot be used.
+  */
+ Datum
+ pg_size_bytes(PG_FUNCTION_ARGS)
+ {
+ 	text	   *arg = PG_GETARG_TEXT_PP(0);
+ 	const char	   *cp = text_to_cstring(arg);
+ 	char		digits[MAX_DIGITS + 1];
+ 	int		ndigits = 0;
+ 	Numeric			num;
+ 	int64 result;
+ 
+ 	/* Skip leading spaces */
+ 	while (isspace((unsigned char) *cp))
+ 		cp++;
+ 
+ 	switch (*cp)
+ 	{
+ 		/* ignore plus symbol */
+ 		case '+':
+ 			cp++;
+ 			break;
+ 		case '-':
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("size cannot be negative")));
+ 	}
+ 
+ 	while (*cp)
+ 	{
+ 		if ((isdigit(*cp) || *cp == '.') && ndigits < MAX_DIGITS)
+ 			digits[ndigits++] = *cp++;
+ 		else
+ 			break;
+ 	}
+ 
+ 	if (ndigits == 0)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("invalid format")));
+ 
+ 	digits[ndigits] = '\0';
+ 	num = DatumGetNumeric(DirectFunctionCall3(numeric_in,
+ 			CStringGetDatum(digits), 0, -1));
+ 
+ 	/* allow whitespace between integer and unit */
+ 	while (isspace((unsigned char) *cp))
+ 		cp++;
+ 
+ 	/* Handle possible unit */
+ 	if (*cp != '\0')
+ 	{
+ 		char		unit[MAX_UNIT_LEN + 1];
+ 		int		unitlen = 0;
+ 		int		multiplier;
+ 		Numeric			mul_num;
+ 		const char	*hintmsg;
+ 
+ 		while (*cp && !isspace(*cp) && unitlen < MAX_UNIT_LEN)
+ 			unit[unitlen++] = *cp++;
+ 
+ 		unit[unitlen] = '\0';
+ 
+ 		/* allow whitespace after unit */
+ 		while (isspace((unsigned char) *cp))
+ 			cp++;
+ 
+ 		if (*cp != '\0')
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("invalid format")));
+ 
+ 		if (!parse_memory_unit(unit, , ))
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("invalid unit \"%s\"", unit),
+ 	 errhint("%s", _(hintmsg;
+ 
+ 		/*
+ 		 * Now, the multiplier is in KB unit. It should be multiplied by 1024
+ 		 * before usage
+ 		 */
+ 		mul_num = DatumGetNumeric(DirectFunctionCall1(int8_numeric,
+ 		Int64GetDatum(multiplier * 1024L)));
+ 
+ 		num = DatumGetNumeric(DirectFunctionCall2(numeric_mul,
+ 			NumericGetDatum(mul_num),
+ 			NumericGetDatum(num)));
+ 	}
+ 
+ 	result = 

Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-30 Thread Shulgin, Oleksandr
On Tue, Dec 29, 2015 at 7:15 PM, Pavel Stehule 
wrote:

>
>> I didn't check out earlier versions of this patch, but the latest one
>> still changes pg_size_pretty() to emit PB suffix.
>>
>> I don't think it is worth it to throw a number of changes together like
>> that.  We should focus on adding pg_size_bytes() first and make it
>> compatible with both pg_size_pretty() and existing GUC units: that is
>> support suffixes up to TB and make sure they have the meaning of powers of
>> 2^10, not 10^3.  Re-using the table present in guc.c would be a plus.
>>
>> Next, we could think about adding handling of PB suffix on input and
>> output, but I don't see a big problem if that is emitted as 1024TB or the
>> user has to specify it as 1024TB in a GUC or argument to pg_size_bytes():
>> an minor inconvenience only.
>>
>
> Last version still support BP in pg_size_pretty. It isn't big change. PB
> isn't issue.
>
> We have to do significant decision - should to support SI units in
> pg_size_bytes? We cannot to change it later. There is disagreement for SI
> units in pg_size_pretty, so SI units in pg_size_bytes can be inconsistent.
>

There is no way at this point to add support for SI units in a consistent
and backwards-compatible manner: both GUC and pg_size_pretty() use SI
suffixes (kB, MB, GB, TB) with the meaning of 2^(10*n) (KiB, MiB, GiB,
TiB).  But given that the size relates to memory or disk space, it is quite
customary *not* to use SI units, so I don't see a point in adding those.

--
Alex


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-29 Thread Pavel Stehule
Hi


>>> -1 from me.  I don't think we should muck with the way pg_size_pretty
>>> works.
>>>
>>
>> new update - I reverted changes in pg_size_pretty
>>
>
> Hi,
>
> I didn't check out earlier versions of this patch, but the latest one
> still changes pg_size_pretty() to emit PB suffix.
>
> I don't think it is worth it to throw a number of changes together like
> that.  We should focus on adding pg_size_bytes() first and make it
> compatible with both pg_size_pretty() and existing GUC units: that is
> support suffixes up to TB and make sure they have the meaning of powers of
> 2^10, not 10^3.  Re-using the table present in guc.c would be a plus.
>
> Next, we could think about adding handling of PB suffix on input and
> output, but I don't see a big problem if that is emitted as 1024TB or the
> user has to specify it as 1024TB in a GUC or argument to pg_size_bytes():
> an minor inconvenience only.
>

Last version still support BP in pg_size_pretty. It isn't big change. PB
isn't issue.

We have to do significant decision - should to support SI units in
pg_size_bytes? We cannot to change it later. There is disagreement for SI
units in pg_size_pretty, so SI units in pg_size_bytes can be inconsistent.

Regards

Pavel


>
> --
> Alex
>
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-28 Thread Shulgin, Oleksandr
On Tue, Dec 22, 2015 at 10:45 AM, Pavel Stehule 
wrote:

> Hi
>
> 2015-12-21 16:11 GMT+01:00 Robert Haas :
>
>> On Sun, Dec 20, 2015 at 4:54 AM, Pavel Stehule 
>> wrote:
>> > new update:
>> >
>> > 1. unit searching is case insensitive
>> >
>> > 2. initial support for binary byte prefixes - KiB, MiB, ..  (IEC
>> standard),
>> > change behave for SI units
>> >
>> > Second point is much more complex then it is looking - if pg_size_bytes
>> > should be consistent with pg_size_pretty.
>> >
>> > The current pg_size_pretty and transformations in guc.c are based on
>> JEDEC
>> > standard. Using this standard for GUC has sense - using it for object
>> sizes
>> > is probably unhappy.
>> >
>> > I tried to fix (and enhance) pg_size_pretty - now reports correct
>> units, and
>> > via second parameter it allows to specify base: 2 (binary, IEC  -
>> default)
>> > or 10 (SI).
>>
>> -1 from me.  I don't think we should muck with the way pg_size_pretty
>> works.
>>
>
> new update - I reverted changes in pg_size_pretty
>

Hi,

I didn't check out earlier versions of this patch, but the latest one still
changes pg_size_pretty() to emit PB suffix.

I don't think it is worth it to throw a number of changes together like
that.  We should focus on adding pg_size_bytes() first and make it
compatible with both pg_size_pretty() and existing GUC units: that is
support suffixes up to TB and make sure they have the meaning of powers of
2^10, not 10^3.  Re-using the table present in guc.c would be a plus.

Next, we could think about adding handling of PB suffix on input and
output, but I don't see a big problem if that is emitted as 1024TB or the
user has to specify it as 1024TB in a GUC or argument to pg_size_bytes():
an minor inconvenience only.

--
Alex


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-22 Thread Pavel Stehule
Hi

2015-12-21 16:11 GMT+01:00 Robert Haas :

> On Sun, Dec 20, 2015 at 4:54 AM, Pavel Stehule 
> wrote:
> > new update:
> >
> > 1. unit searching is case insensitive
> >
> > 2. initial support for binary byte prefixes - KiB, MiB, ..  (IEC
> standard),
> > change behave for SI units
> >
> > Second point is much more complex then it is looking - if pg_size_bytes
> > should be consistent with pg_size_pretty.
> >
> > The current pg_size_pretty and transformations in guc.c are based on
> JEDEC
> > standard. Using this standard for GUC has sense - using it for object
> sizes
> > is probably unhappy.
> >
> > I tried to fix (and enhance) pg_size_pretty - now reports correct units,
> and
> > via second parameter it allows to specify base: 2 (binary, IEC  -
> default)
> > or 10 (SI).
>
> -1 from me.  I don't think we should muck with the way pg_size_pretty
> works.
>

new update - I reverted changes in pg_size_pretty

Regards

Pavel


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 60b9a09..a73f37b
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17607,17612 
--- 17607,17615 
  pg_relation_size
 
 
+ pg_size_bytes
+
+
  pg_size_pretty
 
 
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17677,17682 
--- 17680,17696 
 


+
+ pg_size_bytes(text)
+ 
+bigint
+
+  Converts a size in human-readable format with size units
+  into bytes.  The parameter is case insensitive string. Following
+  units are supported: B, kB, MB, GB, TB, PB, KiB, MiB, TiB, PiB.
+
+   
+   
 
  pg_size_pretty(bigint)
  
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
new file mode 100644
index 5ee59d0..99f10dc
*** a/src/backend/utils/adt/dbsize.c
--- b/src/backend/utils/adt/dbsize.c
***
*** 31,36 
--- 31,63 
  #include "utils/relmapper.h"
  #include "utils/syscache.h"
  
+ #define MAX_UNIT_LEN		3
+ #define MAX_DIGITS		20
+ 
+ typedef struct
+ {
+ 	char		unit[MAX_UNIT_LEN + 1];
+ 	long int	multiplier;
+ } unit_multiplier;
+ 
+ static const unit_multiplier unit_multiplier_table[] =
+ {
+ 	{"kiB", 1024L},
+ 	{"MiB", 1024L * 1024},
+ 	{"GiB", 1024L * 1024 * 1024},
+ 	{"TiB", 1024L * 1024 * 1024 * 1024},
+ 	{"PiB", 1024L * 1024 * 1024 * 1024 * 1024},
+ 	{"B", 1L},
+ 	{"KB", 1000L},
+ 	{"MB", 1000L * 1000},
+ 	{"GB", 1000L * 1000 * 1000},
+ 	{"TB", 1000L * 1000 * 1000 * 1000},
+ 	{"PB", 1000L * 1000 * 1000 * 1000 * 1000},
+ 
+ 	{""}/* end of table marker */
+ };
+ 
+ 
  /* Divide by two and round towards positive infinity. */
  #define half_rounded(x)   (((x) + ((x) < 0 ? 0 : 1)) / 2)
  
*** pg_size_pretty(PG_FUNCTION_ARGS)
*** 559,566 
  else
  {
  	size >>= 10;
! 	snprintf(buf, sizeof(buf), INT64_FORMAT " TB",
! 			 half_rounded(size));
  }
  			}
  		}
--- 586,600 
  else
  {
  	size >>= 10;
! 	if (Abs(size) < limit2)
! 		snprintf(buf, sizeof(buf), INT64_FORMAT " TB",
!  half_rounded(size));
! 	else
! 	{
! 		size >>= 10;
! 		snprintf(buf, sizeof(buf), INT64_FORMAT " PB",
!  half_rounded(size));
! 	}
  }
  			}
  		}
*** pg_size_pretty_numeric(PG_FUNCTION_ARGS)
*** 689,696 
  {
  	/* size >>= 10 */
  	size = numeric_shift_right(size, 10);
! 	size = numeric_half_rounded(size);
! 	result = psprintf("%s TB", numeric_to_cstring(size));
  }
  			}
  		}
--- 723,741 
  {
  	/* size >>= 10 */
  	size = numeric_shift_right(size, 10);
! 
! 	if (numeric_is_less(numeric_absolute(size), limit2))
! 	{
! 		size = numeric_half_rounded(size);
! 		result = psprintf("%s TB", numeric_to_cstring(size));
! 	}
! 	else
! 	{
! 		/* size >>= 10 */
! 		size = numeric_shift_right(size, 10);
! 		size = numeric_half_rounded(size);
! 		result = psprintf("%s PB", numeric_to_cstring(size));
! 	}
  }
  			}
  		}
*** pg_size_pretty_numeric(PG_FUNCTION_ARGS)
*** 700,705 
--- 745,853 
  }
  
  /*
+  * Convert human readable size to long int.
+  *
+  * cannot to use parse_int due too low limits there
+  */
+ Datum
+ pg_size_bytes(PG_FUNCTION_ARGS)
+ {
+ 	text	   *arg = PG_GETARG_TEXT_PP(0);
+ 	const char	   *cp = text_to_cstring(arg);
+ 	char		digits[MAX_DIGITS + 1];
+ 	int		ndigits = 0;
+ 	Numeric			num;
+ 	int64 result;
+ 
+ 	/* Skip leading spaces */
+ 	while (isspace((unsigned char) *cp))
+ 		cp++;
+ 
+ 	switch (*cp)
+ 	{
+ 		/* ignore plus symbol */
+ 		case '+':
+ 			cp++;
+ 			break;
+ 		case '-':
+ 			ereport(ERROR,
+ 	

Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-21 Thread Michael Paquier
On Tue, Dec 22, 2015 at 12:11 AM, Robert Haas  wrote:
> On Sun, Dec 20, 2015 at 4:54 AM, Pavel Stehule  
> wrote:
>> new update:
>>
>> 1. unit searching is case insensitive
>>
>> 2. initial support for binary byte prefixes - KiB, MiB, ..  (IEC standard),
>> change behave for SI units
>>
>> Second point is much more complex then it is looking - if pg_size_bytes
>> should be consistent with pg_size_pretty.
>>
>> The current pg_size_pretty and transformations in guc.c are based on JEDEC
>> standard. Using this standard for GUC has sense - using it for object sizes
>> is probably unhappy.
>>
>> I tried to fix (and enhance) pg_size_pretty - now reports correct units, and
>> via second parameter it allows to specify base: 2 (binary, IEC  - default)
>> or 10 (SI).
>
> -1 from me.  I don't think we should muck with the way pg_size_pretty works.

Yeah.

+ static const unit_multiplier unit_multiplier_table[] =
+ {
+ {"B", 1L},
+ {"kiB", 1024L},
+ {"MiB", 1024L * 1024},
+ {"GiB", 1024L * 1024 * 1024},
+ {"TiB", 1024L * 1024 * 1024 * 1024},
+ {"PiB", 1024L * 1024 * 1024 * 1024 * 1024},
This is rather close to memory_unit_conversion_table in guc.c. Would
it be worth refactoring those unit tables into something more generic
instead of duplicating them?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-21 Thread Pavel Stehule
2015-12-22 6:57 GMT+01:00 Michael Paquier :

> On Tue, Dec 22, 2015 at 2:33 PM, Pavel Stehule 
> wrote:
> >
> >
> > 2015-12-22 6:22 GMT+01:00 Michael Paquier :
> >>
> >> On Tue, Dec 22, 2015 at 12:11 AM, Robert Haas 
> >> wrote:
> >> > On Sun, Dec 20, 2015 at 4:54 AM, Pavel Stehule <
> pavel.steh...@gmail.com>
> >> > wrote:
> >> >> new update:
> >> >>
> >> >> 1. unit searching is case insensitive
> >> >>
> >> >> 2. initial support for binary byte prefixes - KiB, MiB, ..  (IEC
> >> >> standard),
> >> >> change behave for SI units
> >> >>
> >> >> Second point is much more complex then it is looking - if
> pg_size_bytes
> >> >> should be consistent with pg_size_pretty.
> >> >>
> >> >> The current pg_size_pretty and transformations in guc.c are based on
> >> >> JEDEC
> >> >> standard. Using this standard for GUC has sense - using it for object
> >> >> sizes
> >> >> is probably unhappy.
> >> >>
> >> >> I tried to fix (and enhance) pg_size_pretty - now reports correct
> >> >> units, and
> >> >> via second parameter it allows to specify base: 2 (binary, IEC  -
> >> >> default)
> >> >> or 10 (SI).
> >> >
> >> > -1 from me.  I don't think we should muck with the way pg_size_pretty
> >> > works.
> >>
> >> Yeah.
> >>
> >> + static const unit_multiplier unit_multiplier_table[] =
> >> + {
> >> + {"B", 1L},
> >> + {"kiB", 1024L},
> >> + {"MiB", 1024L * 1024},
> >> + {"GiB", 1024L * 1024 * 1024},
> >> + {"TiB", 1024L * 1024 * 1024 * 1024},
> >> + {"PiB", 1024L * 1024 * 1024 * 1024 * 1024},
> >> This is rather close to memory_unit_conversion_table in guc.c. Would
> >> it be worth refactoring those unit tables into something more generic
> >> instead of duplicating them?
> >
> >
> > yes, it is possible with following impacts:
> >
> > 1. We need add PB to memory_unit_conversion_table in guc.c
>
> No real objection to that. It would would make sense to have it, but
> we could not use it directly for a GUC. This just reminded me that
> even if we support TB in GUC params, it is not possible to set for
> example a GUC_UNIT_KB param to more than 2TB because those are limited
> to be int32.
>
> > 2. This table holds multipliers in JEDEC standard - and introduce other
> > standards IEC, SI there isn't good idea.
> >
> > Is it ok?
>
> Do you think it would be interesting to have GUC parameters able to
> use those units? If this is a standard, it may make sense to actually
> have them, no? Just a random thought, that's not something this patch
> should take care of, but it would be good to avoid code duplication
> where we can avoid it.
> Regards,
>

Nice to have it. But it can introduce a upgrade issues - there isn't
possible to specify base. If we do some change here, we have to do related
changes everywhere.

Our implementation is little bit obsolete, but nobody did some error
reports, so probably now isn't time to change it. It's not too important.

Better to have two conversion tables (simple and small), than opening new
topic, where I don't see any agreement to do changes - and these changes
can be done separately.

Pavel



> --
> Michael
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-21 Thread Pavel Stehule
2015-12-22 6:22 GMT+01:00 Michael Paquier :

> On Tue, Dec 22, 2015 at 12:11 AM, Robert Haas 
> wrote:
> > On Sun, Dec 20, 2015 at 4:54 AM, Pavel Stehule 
> wrote:
> >> new update:
> >>
> >> 1. unit searching is case insensitive
> >>
> >> 2. initial support for binary byte prefixes - KiB, MiB, ..  (IEC
> standard),
> >> change behave for SI units
> >>
> >> Second point is much more complex then it is looking - if pg_size_bytes
> >> should be consistent with pg_size_pretty.
> >>
> >> The current pg_size_pretty and transformations in guc.c are based on
> JEDEC
> >> standard. Using this standard for GUC has sense - using it for object
> sizes
> >> is probably unhappy.
> >>
> >> I tried to fix (and enhance) pg_size_pretty - now reports correct
> units, and
> >> via second parameter it allows to specify base: 2 (binary, IEC  -
> default)
> >> or 10 (SI).
> >
> > -1 from me.  I don't think we should muck with the way pg_size_pretty
> works.
>
> Yeah.
>
> + static const unit_multiplier unit_multiplier_table[] =
> + {
> + {"B", 1L},
> + {"kiB", 1024L},
> + {"MiB", 1024L * 1024},
> + {"GiB", 1024L * 1024 * 1024},
> + {"TiB", 1024L * 1024 * 1024 * 1024},
> + {"PiB", 1024L * 1024 * 1024 * 1024 * 1024},
> This is rather close to memory_unit_conversion_table in guc.c. Would
> it be worth refactoring those unit tables into something more generic
> instead of duplicating them?
>

yes, it is possible with following impacts:

1. We need add PB to memory_unit_conversion_table in guc.c
2. This table holds multipliers in JEDEC standard - and introduce other
standards IEC, SI there isn't good idea.

Is it ok?

Regards

Pavel



> --
> Michael
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-21 Thread Michael Paquier
On Tue, Dec 22, 2015 at 2:33 PM, Pavel Stehule  wrote:
>
>
> 2015-12-22 6:22 GMT+01:00 Michael Paquier :
>>
>> On Tue, Dec 22, 2015 at 12:11 AM, Robert Haas 
>> wrote:
>> > On Sun, Dec 20, 2015 at 4:54 AM, Pavel Stehule 
>> > wrote:
>> >> new update:
>> >>
>> >> 1. unit searching is case insensitive
>> >>
>> >> 2. initial support for binary byte prefixes - KiB, MiB, ..  (IEC
>> >> standard),
>> >> change behave for SI units
>> >>
>> >> Second point is much more complex then it is looking - if pg_size_bytes
>> >> should be consistent with pg_size_pretty.
>> >>
>> >> The current pg_size_pretty and transformations in guc.c are based on
>> >> JEDEC
>> >> standard. Using this standard for GUC has sense - using it for object
>> >> sizes
>> >> is probably unhappy.
>> >>
>> >> I tried to fix (and enhance) pg_size_pretty - now reports correct
>> >> units, and
>> >> via second parameter it allows to specify base: 2 (binary, IEC  -
>> >> default)
>> >> or 10 (SI).
>> >
>> > -1 from me.  I don't think we should muck with the way pg_size_pretty
>> > works.
>>
>> Yeah.
>>
>> + static const unit_multiplier unit_multiplier_table[] =
>> + {
>> + {"B", 1L},
>> + {"kiB", 1024L},
>> + {"MiB", 1024L * 1024},
>> + {"GiB", 1024L * 1024 * 1024},
>> + {"TiB", 1024L * 1024 * 1024 * 1024},
>> + {"PiB", 1024L * 1024 * 1024 * 1024 * 1024},
>> This is rather close to memory_unit_conversion_table in guc.c. Would
>> it be worth refactoring those unit tables into something more generic
>> instead of duplicating them?
>
>
> yes, it is possible with following impacts:
>
> 1. We need add PB to memory_unit_conversion_table in guc.c

No real objection to that. It would would make sense to have it, but
we could not use it directly for a GUC. This just reminded me that
even if we support TB in GUC params, it is not possible to set for
example a GUC_UNIT_KB param to more than 2TB because those are limited
to be int32.

> 2. This table holds multipliers in JEDEC standard - and introduce other
> standards IEC, SI there isn't good idea.
>
> Is it ok?

Do you think it would be interesting to have GUC parameters able to
use those units? If this is a standard, it may make sense to actually
have them, no? Just a random thought, that's not something this patch
should take care of, but it would be good to avoid code duplication
where we can avoid it.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-21 Thread Robert Haas
On Sun, Dec 20, 2015 at 4:54 AM, Pavel Stehule  wrote:
> new update:
>
> 1. unit searching is case insensitive
>
> 2. initial support for binary byte prefixes - KiB, MiB, ..  (IEC standard),
> change behave for SI units
>
> Second point is much more complex then it is looking - if pg_size_bytes
> should be consistent with pg_size_pretty.
>
> The current pg_size_pretty and transformations in guc.c are based on JEDEC
> standard. Using this standard for GUC has sense - using it for object sizes
> is probably unhappy.
>
> I tried to fix (and enhance) pg_size_pretty - now reports correct units, and
> via second parameter it allows to specify base: 2 (binary, IEC  - default)
> or 10 (SI).

-1 from me.  I don't think we should muck with the way pg_size_pretty works.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-20 Thread Pavel Stehule
Hi

new update:

1. unit searching is case insensitive

2. initial support for binary byte prefixes - KiB, MiB, ..  (IEC standard),
change behave for SI units

Second point is much more complex then it is looking - if pg_size_bytes
should be consistent with pg_size_pretty.

The current pg_size_pretty and transformations in guc.c are based on JEDEC
standard. Using this standard for GUC has sense - using it for object sizes
is probably unhappy.

I tried to fix (and enhance) pg_size_pretty - now reports correct units,
and via second parameter it allows to specify base: 2 (binary, IEC  -
default) or 10 (SI).

I think it is good to have it. These standards are generic and wide used,
but should to be pretty explained in documentation if we will use JEDEC for
configuration. Probably better to leave JEDEC and prefer SI and IEC.

Plan B is fix Postgres on JEDEC only - it is trivial, simple - but it can
look like archaic in next years.

Comments, notices?

Regards

Pavel
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 60b9a09..d9a4f34
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17607,17612 
--- 17607,17615 
  pg_relation_size
 
 
+ pg_size_bytes
+
+
  pg_size_pretty
 
 
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17678,17699 


 
! pg_size_pretty(bigint)
  
 text
 
   Converts a size in bytes expressed as a 64-bit integer into a
!  human-readable format with size units
 


 
! pg_size_pretty(numeric)
  
 text
 
   Converts a size in bytes expressed as a numeric value into a
!  human-readable format with size units
 


--- 17681,17715 


 
! pg_size_bytes(text)
! 
!bigint
!
!  Converts a size in human-readable format with size units
!  into bytes. The parameter is case insensitive string. Following
!  units are supported: B, kB, MB, GB, TB, PB, KiB, MiB, TiB, PiB.
!
!   
!   
!
! pg_size_pretty(bigint , int)
  
 text
 
   Converts a size in bytes expressed as a 64-bit integer into a
!  human-readable format with size units. Second parameter allows to
!  specify the base (2 or 10). The binary base is default.
 


 
! pg_size_pretty(numeric , int)
  
 text
 
   Converts a size in bytes expressed as a numeric value into a
!  human-readable format with size units. Second parameter allows to
!  specify the base (2 or 10). The binary base is default.
 


diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
new file mode 100644
index 5ee59d0..819da35
*** a/src/backend/utils/adt/dbsize.c
--- b/src/backend/utils/adt/dbsize.c
***
*** 31,36 
--- 31,64 
  #include "utils/relmapper.h"
  #include "utils/syscache.h"
  
+ #define MAX_UNIT_LEN		3
+ #define MAX_DIGITS		20
+ 
+ typedef struct
+ {
+ 	char		unit[MAX_UNIT_LEN + 1];
+ 	long int	multiplier;
+ } unit_multiplier;
+ 
+ static const unit_multiplier unit_multiplier_table[] =
+ {
+ 	{"B", 1L},
+ 	{"kiB", 1024L},
+ 	{"MiB", 1024L * 1024},
+ 	{"GiB", 1024L * 1024 * 1024},
+ 	{"TiB", 1024L * 1024 * 1024 * 1024},
+ 	{"PiB", 1024L * 1024 * 1024 * 1024 * 1024},
+ 	{"B", 1L},
+ 	{"kB", 1000L},
+ 	{"MB", 1000L * 1000},
+ 	{"GB", 1000L * 1000 * 1000},
+ 	{"TB", 1000L * 1000 * 1000 * 1000},
+ 	{"PB", 1000L * 1000 * 1000 * 1000 * 1000},
+ 
+ 	{""}/* end of table marker */
+ };
+ 
+ 
  /* Divide by two and round towards positive infinity. */
  #define half_rounded(x)   (((x) + ((x) < 0 ? 0 : 1)) / 2)
  
*** calculate_table_size(Relation rel)
*** 409,415 
   * Calculate total on-disk size of all indexes attached to the given table.
   *
   * Can be applied safely to an index, but you'll just get zero.
!  */
  static int64
  calculate_indexes_size(Relation rel)
  {
--- 437,443 
   * Calculate total on-disk size of all indexes attached to the given table.
   *
   * Can be applied safely to an index, but you'll just get zero.
!  */	
  static int64
  calculate_indexes_size(Relation rel)
  {
*** pg_total_relation_size(PG_FUNCTION_ARGS)
*** 526,574 
  }
  
  /*
!  * formatting with size units
   */
  Datum
  pg_size_pretty(PG_FUNCTION_ARGS)
  {
  	int64		size = PG_GETARG_INT64(0);
  	char		buf[64];
- 	int64		limit = 10 * 1024;
- 	int64		limit2 = limit * 2 - 1;
  
! 	if (Abs(size) < limit)
! 		snprintf(buf, sizeof(buf), INT64_FORMAT " bytes", size);
! 	else
  	{
! 		size >>= 9;/* keep one extra bit for rounding */
! 		if (Abs(size) < limit2)
! 			snprintf(buf, sizeof(buf), INT64_FORMAT " kB",

Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-01 Thread Kyotaro HORIGUCHI
Hello, The meaning of "be orange" (I couldn't find it) interests
me but putting it aside..

I have some random comments.

At Mon, 30 Nov 2015 18:30:18 +0100, Pavel Stehule  
wrote in 
> Hi
> 
> 
> > 2. using independent implementation - there is some redundant code, but we
> > can support duble insted int, and we can support some additional units.
> >
> 
>  new patch is based on own implementation - use numeric/bigint calculations
> 
> + regress tests and doc

1. What do you think about binary byte prefixes? (kiB, MiB...)

 I couldn't read what Robert wrote upthread but I also would like
 to have 2-base byte unit prifixes. (Sorry I couldn't put it
 aside..)


2. Doesn't it allow units in lower case?

 I think '1gb' and so should be acceptable as an input.


3. Why are you allow positive sign prefixing digits? I suppose
  that positive sign also shouldn't be allowed if it rejects
  prifixing negative sign.

4. Useless includes

 dbsize.c is modified to include guc.h but it is useless.

5. Error message

+   if (ndigits == 0)
+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("string is empty")));

 If pg_size_bytes allows prefixing positive sign or spaces,
 ndigits == 0 here doesn't mean that the whole string is empty.


6. Symmetry with pg_size_pretty

 pg_size_pretty doesn't have units larger than TB. Just adding PB
 to it breaks backward compatibility, but leaving as it is breaks
 symmetry with this function. Some possible solution for this
 that I can guess for now are..

  - Leave it as is.

  - New GUC to allow PB for pg_size_pretty().

  - Expanded function such like pg_size_pretty2 (oops..)

  - New polymorphic (sibling?) function with additional
parameters to instruct how they work. (The following
involving 2-base representations)

pg_size_pretty(n, <2base>, );
pg_size_bytes(n, <2base>, );

7. Docs

+  Converts a size in human-readable format with size units
+  to bytes

 The descriptions for the functions around use 'into' instead of
 'to' for the preposition.


8. Regression

 The regression in the patch looks good enough as is (except that
 it forgets the unit 'B' or prifixing spaces or sings), but they
 are necessarily have individual tests. The following SQL
 statement will do them at once.

  SELECT s, pg_size_bytes(s) FROM (VALUES ('1'), ('1B')...) as t(s);

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-01 Thread Pavel Stehule
2015-12-01 11:02 GMT+01:00 Kyotaro HORIGUCHI <
horiguchi.kyot...@lab.ntt.co.jp>:

> Hello, The meaning of "be orange" (I couldn't find it) interests
> me but putting it aside..
>
> I have some random comments.
>
> At Mon, 30 Nov 2015 18:30:18 +0100, Pavel Stehule 
> wrote in <
> cafj8prcd6we3tqmr0vbcn98wf0p3o3h6nau4btaoswcj443...@mail.gmail.com>
> > Hi
> >
> >
> > > 2. using independent implementation - there is some redundant code,
> but we
> > > can support duble insted int, and we can support some additional units.
> > >
> >
> >  new patch is based on own implementation - use numeric/bigint
> calculations
> >
> > + regress tests and doc
>
> 1. What do you think about binary byte prefixes? (kiB, MiB...)
>

I don't know this mechanics - can you write it? It can be good idea/


>
>  I couldn't read what Robert wrote upthread but I also would like
>  to have 2-base byte unit prifixes. (Sorry I couldn't put it
>  aside..)
>
>
> 2. Doesn't it allow units in lower case?
>

The parser is consistent with a behave used in configure file processing.
We can change it, but then there will be divergence between this function
and GUC parser.


>
>  I think '1gb' and so should be acceptable as an input.
>
>
> 3. Why are you allow positive sign prefixing digits? I suppose
>   that positive sign also shouldn't be allowed if it rejects
>   prifixing negative sign.
>

I have not strong opinion about it. '-' is exactly wrong, but '+' can be
acceptable. But it can be changed.


>
> 4. Useless includes
>
>  dbsize.c is modified to include guc.h but it is useless.
>

I'll fix it.


>
> 5. Error message
>
> +   if (ndigits == 0)
> +   ereport(ERROR,
> +   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +errmsg("string is empty")));
>
>  If pg_size_bytes allows prefixing positive sign or spaces,
>  ndigits == 0 here doesn't mean that the whole string is empty.
>
>
I'll fix it.


>
> 6. Symmetry with pg_size_pretty
>
>  pg_size_pretty doesn't have units larger than TB. Just adding PB
>  to it breaks backward compatibility, but leaving as it is breaks
>  symmetry with this function. Some possible solution for this
>  that I can guess for now are..
>

I prefer a warning about possible compatibility issue in pg_size_pretty and
support PB directly there.


>
>   - Leave it as is.
>
>   - New GUC to allow PB for pg_size_pretty().
>
>   - Expanded function such like pg_size_pretty2 (oops..)
>
>   - New polymorphic (sibling?) function with additional
> parameters to instruct how they work. (The following
> involving 2-base representations)
>
> pg_size_pretty(n, <2base>, );
> pg_size_bytes(n, <2base>, );
>
> 7. Docs
>
> +  Converts a size in human-readable format with size units
> +  to bytes
>
>  The descriptions for the functions around use 'into' instead of
>  'to' for the preposition.
>
>
> 8. Regression
>
>  The regression in the patch looks good enough as is (except that
>  it forgets the unit 'B' or prifixing spaces or sings), but they
>  are necessarily have individual tests. The following SQL
>  statement will do them at once.
>
>   SELECT s, pg_size_bytes(s) FROM (VALUES ('1'), ('1B')...) as t(s);
>
>
I'll fix it.

Thank you for your ideas

Regards

Pavel


> regards,
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>
>
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-01 Thread Pavel Stehule
2015-12-01 11:12 GMT+01:00 Pavel Stehule :

>
>
> 2015-12-01 11:02 GMT+01:00 Kyotaro HORIGUCHI <
> horiguchi.kyot...@lab.ntt.co.jp>:
>
>> Hello, The meaning of "be orange" (I couldn't find it) interests
>> me but putting it aside..
>>
>> I have some random comments.
>>
>> At Mon, 30 Nov 2015 18:30:18 +0100, Pavel Stehule <
>> pavel.steh...@gmail.com> wrote in <
>> cafj8prcd6we3tqmr0vbcn98wf0p3o3h6nau4btaoswcj443...@mail.gmail.com>
>> > Hi
>> >
>> >
>> > > 2. using independent implementation - there is some redundant code,
>> but we
>> > > can support duble insted int, and we can support some additional
>> units.
>> > >
>> >
>> >  new patch is based on own implementation - use numeric/bigint
>> calculations
>> >
>> > + regress tests and doc
>>
>> 1. What do you think about binary byte prefixes? (kiB, MiB...)
>>
>
> I don't know this mechanics - can you write it? It can be good idea/
>
>
>>
>>  I couldn't read what Robert wrote upthread but I also would like
>>  to have 2-base byte unit prifixes. (Sorry I couldn't put it
>>  aside..)
>>
>>
>> 2. Doesn't it allow units in lower case?
>>
>
> The parser is consistent with a behave used in configure file processing.
> We can change it, but then there will be divergence between this function
> and GUC parser.
>
>
>>
>>  I think '1gb' and so should be acceptable as an input.
>>
>>
>> 3. Why are you allow positive sign prefixing digits? I suppose
>>   that positive sign also shouldn't be allowed if it rejects
>>   prifixing negative sign.
>>
>
> I have not strong opinion about it. '-' is exactly wrong, but '+' can be
> acceptable. But it can be changed.
>
>
>>
>> 4. Useless includes
>>
>>  dbsize.c is modified to include guc.h but it is useless.
>>
>
> I'll fix it.
>
>
>>
>> 5. Error message
>>
>> +   if (ndigits == 0)
>> +   ereport(ERROR,
>> +   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>> +errmsg("string is empty")));
>>
>>  If pg_size_bytes allows prefixing positive sign or spaces,
>>  ndigits == 0 here doesn't mean that the whole string is empty.
>>
>>
> I'll fix it.
>
>
>>
>> 6. Symmetry with pg_size_pretty
>>
>>  pg_size_pretty doesn't have units larger than TB. Just adding PB
>>  to it breaks backward compatibility, but leaving as it is breaks
>>  symmetry with this function. Some possible solution for this
>>  that I can guess for now are..
>>
>
> I prefer a warning about possible compatibility issue in pg_size_pretty
> and support PB directly there.
>
>
>>
>>   - Leave it as is.
>>
>>   - New GUC to allow PB for pg_size_pretty().
>>
>>   - Expanded function such like pg_size_pretty2 (oops..)
>>
>>   - New polymorphic (sibling?) function with additional
>> parameters to instruct how they work. (The following
>> involving 2-base representations)
>>
>> pg_size_pretty(n, <2base>, );
>> pg_size_bytes(n, <2base>, );
>>
>> 7. Docs
>>
>> +  Converts a size in human-readable format with size units
>> +  to bytes
>>
>>  The descriptions for the functions around use 'into' instead of
>>  'to' for the preposition.
>>
>>
>> 8. Regression
>>
>>  The regression in the patch looks good enough as is (except that
>>  it forgets the unit 'B' or prifixing spaces or sings), but they
>>  are necessarily have individual tests. The following SQL
>>  statement will do them at once.
>>
>>   SELECT s, pg_size_bytes(s) FROM (VALUES ('1'), ('1B')...) as t(s);
>>
>>
> I'll fix it.
>

here is updated patch

Regards

Pavel


>
> Thank you for your ideas
>
> Regards
>
> Pavel
>
>
>> regards,
>>
>> --
>> Kyotaro Horiguchi
>> NTT Open Source Software Center
>>
>>
>>
>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 60b9a09..55e7d35
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17607,17612 
--- 17607,17615 
  pg_relation_size
 
 
+ pg_size_bytes
+
+
  pg_size_pretty
 
 
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17677,17682 
--- 17680,17695 
 


+
+ pg_size_bytes(text)
+ 
+bigint
+
+  Converts a size in human-readable format with size units
+  into bytes.
+
+   
+   
 
  pg_size_pretty(bigint)
  
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
new file mode 100644
index 5ee59d0..0e7399c
*** a/src/backend/utils/adt/dbsize.c
--- b/src/backend/utils/adt/dbsize.c
***
*** 31,36 
--- 31,58 
  #include "utils/relmapper.h"
  #include "utils/syscache.h"
  
+ #define MAX_UNIT_LEN		3
+ #define MAX_DIGITS		20
+ 
+ typedef struct
+ {
+ 	char		unit[MAX_UNIT_LEN + 1];
+ 	long int	multiplier;
+ } unit_multiplier;
+ 
+ static const unit_multiplier unit_multiplier_table[] =
+ {
+ 	{"PB", 1024L * 1024 * 1024 * 1024 * 1024},

Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-30 Thread Pavel Stehule
Hi


> 2. using independent implementation - there is some redundant code, but we
> can support duble insted int, and we can support some additional units.
>

 new patch is based on own implementation - use numeric/bigint calculations

+ regress tests and doc

Regards

Pavel
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 60b9a09..c94743e
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17607,17612 
--- 17607,17615 
  pg_relation_size
 
 
+ pg_size_bytes
+
+
  pg_size_pretty
 
 
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17677,17682 
--- 17680,17695 
 


+
+ pg_size_bytes(text)
+ 
+bigint
+
+  Converts a size in human-readable format with size units
+  to bytes
+
+   
+   
 
  pg_size_pretty(bigint)
  
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
new file mode 100644
index 5ee59d0..a87c6a7
*** a/src/backend/utils/adt/dbsize.c
--- b/src/backend/utils/adt/dbsize.c
***
*** 25,36 
--- 25,59 
  #include "storage/fd.h"
  #include "utils/acl.h"
  #include "utils/builtins.h"
+ #include "utils/guc.h"
  #include "utils/numeric.h"
  #include "utils/rel.h"
  #include "utils/relfilenodemap.h"
  #include "utils/relmapper.h"
  #include "utils/syscache.h"
  
+ #define MAX_UNIT_LEN		3
+ #define MAX_DIGITS		20
+ 
+ typedef struct
+ {
+ 	char		unit[MAX_UNIT_LEN + 1];
+ 	long int	multiplier;
+ } unit_multiplier;
+ 
+ static const unit_multiplier unit_multiplier_table[] =
+ {
+ 	{"PB", 1024L * 1024 * 1024 * 1024 * 1024},
+ 	{"TB", 1024L * 1024 * 1024 * 1024},
+ 	{"GB", 1024L * 1024 * 1024},
+ 	{"MB", 1024L * 1024},
+ 	{"kB", 1024L},
+ 	{"B", 1L},
+ 
+ 	{""}/* end of table marker */
+ };
+ 
+ 
  /* Divide by two and round towards positive infinity. */
  #define half_rounded(x)   (((x) + ((x) < 0 ? 0 : 1)) / 2)
  
*** pg_size_pretty_numeric(PG_FUNCTION_ARGS)
*** 700,705 
--- 723,830 
  }
  
  /*
+  * Convert human readable size to long int.
+  *
+  * cannot to use parse_int due too low limits there
+  */
+ Datum
+ pg_size_bytes(PG_FUNCTION_ARGS)
+ {
+ 	text	   *arg = PG_GETARG_TEXT_PP(0);
+ 	const char	   *cp = text_to_cstring(arg);
+ 	char		digits[MAX_DIGITS + 1];
+ 	int		ndigits = 0;
+ 	Numeric			num;
+ 	int64 result;
+ 
+ 	/* Skip leading spaces */
+ 	while (isspace((unsigned char) *cp))
+ 		cp++;
+ 
+ 	switch (*cp)
+ 	{
+ 		case '+':
+ 			cp++;
+ 			break;
+ 		case '-':
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("size cannot be negative")));
+ 	}
+ 
+ 	while (*cp)
+ 	{
+ 		if ((isdigit(*cp) || *cp == '.') && ndigits < MAX_DIGITS)
+ 			digits[ndigits++] = *cp++;
+ 		else
+ 			break;
+ 	}
+ 
+ 	if (ndigits == 0)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("string is empty")));
+ 
+ 	digits[ndigits] = '\0';
+ 	num = DatumGetNumeric(DirectFunctionCall3(numeric_in,
+ 			CStringGetDatum(digits), 0, -1));
+ 
+ 	/* allow whitespace between integer and unit */
+ 	while (isspace((unsigned char) *cp))
+ 		cp++;
+ 
+ 	/* Handle possible unit */
+ 	if (*cp != '\0')
+ 	{
+ 		char		unit[MAX_UNIT_LEN + 1];
+ 		int		unitlen = 0;
+ 		int		i;
+ 		long int	multiplier;
+ 		bool			found = false;
+ 		Numeric			mul_num;
+ 
+ 
+ 		while (*cp && !isspace(*cp) && unitlen < MAX_UNIT_LEN)
+ 			unit[unitlen++] = *cp++;
+ 
+ 		unit[unitlen] = '\0';
+ 
+ 		/* allow whitespace after unit */
+ 		while (isspace((unsigned char) *cp))
+ 			cp++;
+ 
+ 		if (*cp != '\0')
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("invalid format")));
+ 
+ 		for (i = 0; *unit_multiplier_table[i].unit; i++)
+ 		{
+ 			if (strcmp(unit, unit_multiplier_table[i].unit) == 0)
+ 			{
+ multiplier = unit_multiplier_table[i].multiplier;
+ found = true;
+ break;
+ 			}
+ 		}
+ 
+ 		if (!found)
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("unknown unit \"%s\"", unit)));
+ 
+ 		mul_num = DatumGetNumeric(DirectFunctionCall1(int8_numeric, Int64GetDatum(multiplier)));
+ 		num = DatumGetNumeric(DirectFunctionCall2(numeric_mul,
+ 			NumericGetDatum(mul_num),
+ 			NumericGetDatum(num)));
+ 	}
+ 
+ 	result = DatumGetInt64(DirectFunctionCall1(numeric_int8, NumericGetDatum(num)));
+ 
+ 	PG_RETURN_INT64(result);
+ }
+ 
+ /*
   * Get the filenode of a relation
   *
   * This is expected to be used in queries like
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index d8640db..b68c8fa
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*** DATA(insert OID = 2286 ( pg_total_relati
*** 3662,3667 
--- 3662,3669 
  DESCR("total disk space usage for the specified 

Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-25 Thread Kevin Grittner
On Wed, Nov 25, 2015 at 12:57 AM, Jim Nasby  wrote:

> Should read " isn't a valid size value"

http://www.postgresql.org/docs/devel/static/error-style-guide.html#AEN110918

| Contractions. Avoid contractions, like "can't"; use "cannot" instead.

So " is not a valid size value" would be better.

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-24 Thread Pavel Stehule
Hi

2015-11-23 19:47 GMT+01:00 Alvaro Herrera :

> Pavel Stehule wrote:
>
> > so pg_size_bytes is good enough for everybody?
>
> That seems good enough to me.
>
> I would have it accept GiB and GB and have both transform to base 2, and
> have an optional boolean flag whose non-default value turns the GB
> interpretation into base 10, leaving the GiB interpretation unaffected.
>

attached proof concept based on parser "parse_int" from guc.c

It works well to 1TB what is enough for memory setting, but too low for
proposed target.

There are two ways

1. enhance the "parse_int"

2. using independent implementation - there is some redundant code, but we
can support duble insted int, and we can support some additional units.

Regards

Pavel


> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
new file mode 100644
index 5ee59d0..3fcd203
*** a/src/backend/utils/adt/dbsize.c
--- b/src/backend/utils/adt/dbsize.c
***
*** 25,30 
--- 25,31 
  #include "storage/fd.h"
  #include "utils/acl.h"
  #include "utils/builtins.h"
+ #include "utils/guc.h"
  #include "utils/numeric.h"
  #include "utils/rel.h"
  #include "utils/relfilenodemap.h"
*** pg_size_pretty_numeric(PG_FUNCTION_ARGS)
*** 700,705 
--- 701,728 
  }
  
  /*
+  * Convert human readable size to long int
+  *
+  */
+ Datum
+ pg_size_bytes(PG_FUNCTION_ARGS)
+ {
+ 	text	   *arg = PG_GETARG_TEXT_PP(0);
+ 	char	   *str = text_to_cstring(arg);
+ 	const char *hintmsg;
+ 	int	   result;
+ 
+ 	if (!parse_int(str, , GUC_UNIT_KB, ))
+ 		ereport(ERROR,
+ 			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 			 errmsg("parameter \"%s\" isn't valid size value",
+ 	str),
+ 			hintmsg ? errhint("%s", _(hintmsg)) : 0));
+ 
+ 	PG_RETURN_INT64(((int64)result) * 1024);
+ }
+ 
+ /*
   * Get the filenode of a relation
   *
   * This is expected to be used in queries like
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index d8640db..b68c8fa
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*** DATA(insert OID = 2286 ( pg_total_relati
*** 3662,3667 
--- 3662,3669 
  DESCR("total disk space usage for the specified table and associated indexes");
  DATA(insert OID = 2288 ( pg_size_pretty			PGNSP PGUID 12 1 0 0 0 f f f f t f v s 1 0 25 "20" _null_ _null_ _null_ _null_ _null_ pg_size_pretty _null_ _null_ _null_ ));
  DESCR("convert a long int to a human readable text using size units");
+ DATA(insert OID = 3317 ( pg_size_bytes			PGNSP PGUID 12 1 0 0 0 f f f f t f v s 1 0 20 "25" _null_ _null_ _null_ _null_ _null_ pg_size_bytes _null_ _null_ _null_ ));
+ DESCR("convert a human readable text with size units to long int bytes");
  DATA(insert OID = 3166 ( pg_size_pretty			PGNSP PGUID 12 1 0 0 0 f f f f t f v s 1 0 25 "1700" _null_ _null_ _null_ _null_ _null_ pg_size_pretty_numeric _null_ _null_ _null_ ));
  DESCR("convert a numeric to a human readable text using size units");
  DATA(insert OID = 2997 ( pg_table_size			PGNSP PGUID 12 1 0 0 0 f f f f t f v s 1 0 20 "2205" _null_ _null_ _null_ _null_ _null_ pg_table_size _null_ _null_ _null_ ));
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
new file mode 100644
index e610bf3..227e5f5
*** a/src/include/utils/builtins.h
--- b/src/include/utils/builtins.h
*** extern Datum pg_relation_size(PG_FUNCTIO
*** 462,467 
--- 462,468 
  extern Datum pg_total_relation_size(PG_FUNCTION_ARGS);
  extern Datum pg_size_pretty(PG_FUNCTION_ARGS);
  extern Datum pg_size_pretty_numeric(PG_FUNCTION_ARGS);
+ extern Datum pg_size_bytes(PG_FUNCTION_ARGS);
  extern Datum pg_table_size(PG_FUNCTION_ARGS);
  extern Datum pg_indexes_size(PG_FUNCTION_ARGS);
  extern Datum pg_relation_filenode(PG_FUNCTION_ARGS);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-24 Thread Jim Nasby

On 11/24/15 10:57 PM, Pavel Stehule wrote:

+errmsg("parameter \"%s\" isn't valid size value",


Should read " isn't a valid size value"
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-23 Thread Gavin Flower

On 24/11/15 06:31, Pavel Stehule wrote:



2015-11-23 18:04 GMT+01:00 Tom Lane >:


Jim Nasby  writes:
> On 11/23/15 3:11 AM, Corey Huinker wrote:
>> +1 to both pg_size_bytes() and ::bytesize. Both contribute to
making the
>> statements more self-documenting.

> The function seems like overkill to me if we have the type. Just my
> opinion though. I'm thinking the type could just be called
'size' too
> (or prettysize?). No reason it has to be tied to bytes (in
particular
> this would work for bits too).

Please, no.  That's *way* too generic a name.

I do not actually agree with making a type for this anyway.  I can
tolerate a function, but adding a datatype is overkill; and it will
introduce far more definitional issues than it's worth. (eg, which
other types should have casts to/from it, and at what level)


so pg_size_bytes is good enough for everybody?

Regards

Pavel


regards, tom lane


perhaps pg_size_bites for those people who want: KiB,  MiB, GiB, TiB, 
PiB, ,..   ???:-)




Cheers,
Gavin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-23 Thread Corey Huinker
On Sun, Nov 22, 2015 at 11:24 PM, Pavel Stehule 
wrote:

>
>
> 2015-11-22 23:54 GMT+01:00 Corey Huinker :
>
>> What about pg_size_unpretty()?
>>>
>> I was going to suggest pg_size_ugly(), but unpretty does emphasize the
>> inverse (rather than opposite) nature of the function.
>>
>
> "unpretty", "ugly" aren't good names
>
> maybe pg_size_bytes or different approach
>
> we can introduce data type - bytesize - default input/output is human
> readable text - and conversion to bigint is implicit
>
> Some like
>
> select .. where pg_table_size(oid) > bytesize '3.5GB'
>
> and instead pg_size_pretty(pg_table_size(oid)) we can write
> pg_table_size(oid)::bytesize
>
> Regards
>
> Pavel
>

+1 to both pg_size_bytes() and ::bytesize. Both contribute to making the
statements more self-documenting.


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-23 Thread Robert Haas
On Mon, Nov 23, 2015 at 1:47 PM, Alvaro Herrera
 wrote:
> Pavel Stehule wrote:
>
>> so pg_size_bytes is good enough for everybody?
>
> That seems good enough to me.
>
> I would have it accept GiB and GB and have both transform to base 2, and
> have an optional boolean flag whose non-default value turns the GB
> interpretation into base 10, leaving the GiB interpretation unaffected.

I think it should be orange.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-23 Thread Tom Lane
Jim Nasby  writes:
> On 11/23/15 3:11 AM, Corey Huinker wrote:
>> +1 to both pg_size_bytes() and ::bytesize. Both contribute to making the
>> statements more self-documenting.

> The function seems like overkill to me if we have the type. Just my 
> opinion though. I'm thinking the type could just be called 'size' too 
> (or prettysize?). No reason it has to be tied to bytes (in particular 
> this would work for bits too).

Please, no.  That's *way* too generic a name.

I do not actually agree with making a type for this anyway.  I can
tolerate a function, but adding a datatype is overkill; and it will
introduce far more definitional issues than it's worth.  (eg, which
other types should have casts to/from it, and at what level)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-23 Thread Alvaro Herrera
Pavel Stehule wrote:

> so pg_size_bytes is good enough for everybody?

That seems good enough to me.

I would have it accept GiB and GB and have both transform to base 2, and
have an optional boolean flag whose non-default value turns the GB
interpretation into base 10, leaving the GiB interpretation unaffected.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-23 Thread Jim Nasby

On 11/23/15 3:11 AM, Corey Huinker wrote:

+1 to both pg_size_bytes() and ::bytesize. Both contribute to making the
statements more self-documenting.


The function seems like overkill to me if we have the type. Just my 
opinion though. I'm thinking the type could just be called 'size' too 
(or prettysize?). No reason it has to be tied to bytes (in particular 
this would work for bits too).


If we're going to add this, I suppose it should support the 'i prefixes' 
too (GiB, MiB, etc).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-23 Thread Pavel Stehule
2015-11-23 18:04 GMT+01:00 Tom Lane :

> Jim Nasby  writes:
> > On 11/23/15 3:11 AM, Corey Huinker wrote:
> >> +1 to both pg_size_bytes() and ::bytesize. Both contribute to making the
> >> statements more self-documenting.
>
> > The function seems like overkill to me if we have the type. Just my
> > opinion though. I'm thinking the type could just be called 'size' too
> > (or prettysize?). No reason it has to be tied to bytes (in particular
> > this would work for bits too).
>
> Please, no.  That's *way* too generic a name.
>
> I do not actually agree with making a type for this anyway.  I can
> tolerate a function, but adding a datatype is overkill; and it will
> introduce far more definitional issues than it's worth.  (eg, which
> other types should have casts to/from it, and at what level)
>

so pg_size_bytes is good enough for everybody?

Regards

Pavel


>
> regards, tom lane
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-22 Thread Corey Huinker
>
> What about pg_size_unpretty()?
>
I was going to suggest pg_size_ugly(), but unpretty does emphasize the
inverse (rather than opposite) nature of the function.


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-22 Thread Guillaume Lelarge
Le 22 nov. 2015 21:29, "Pavel Stehule"  a écrit :
>
>
>
> 2015-11-22 21:19 GMT+01:00 Jim Nasby :
>>
>> On 11/22/15 2:11 PM, Pavel Stehule wrote:
>>>
>>> What about pg_size(text), pg_size(value bigint, unit text) ?
>>
>>
>> I like, though I'd make it numeric or float. pg_size(3.5, 'GB')
certainly seems like a reasonable use case...
>
>
> yes, good note.
>

What about pg_size_unpretty()?


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-22 Thread Pavel Stehule
2015-11-22 23:54 GMT+01:00 Corey Huinker :

> What about pg_size_unpretty()?
>>
> I was going to suggest pg_size_ugly(), but unpretty does emphasize the
> inverse (rather than opposite) nature of the function.
>

"unpretty", "ugly" aren't good names

maybe pg_size_bytes or different approach

we can introduce data type - bytesize - default input/output is human
readable text - and conversion to bigint is implicit

Some like

select .. where pg_table_size(oid) > bytesize '3.5GB'

and instead pg_size_pretty(pg_table_size(oid)) we can write
pg_table_size(oid)::bytesize

Regards

Pavel


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-22 Thread Pavel Stehule
2015-11-22 18:30 GMT+01:00 Jim Nasby :

> On 11/21/15 12:49 AM, Pavel Stehule wrote:
>
>>
>> I propose inversion function to pg_size_pretty function - like
>> pg_human_size.
>>
>> Usage:
>>
>> SELECT * FROM pg_class
>>WHERE pg_table_size(oid) > pg_human_size('2GB');
>>
>
> I'm not a big fan of the name, but +1 on the general idea.
>

I am for any other good name


>
> Maybe pg_size_pretty(text)?
>

I understand to your idea, but it can be too strange - function and
inversion function share same name.

What about pg_size(text), pg_size(value bigint, unit text) ?

Regards

Pavel


> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-22 Thread Marko Tiikkaja

On 2015-11-22 18:30, Jim Nasby wrote:

On 11/21/15 12:49 AM, Pavel Stehule wrote:


I propose inversion function to pg_size_pretty function - like
pg_human_size.

Usage:

SELECT * FROM pg_class
WHERE pg_table_size(oid) > pg_human_size('2GB');


I'm not a big fan of the name, but +1 on the general idea.

Maybe pg_size_pretty(text)?


pg_ytterp_ezis(text)


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-22 Thread Jim Nasby

On 11/22/15 2:11 PM, Pavel Stehule wrote:

What about pg_size(text), pg_size(value bigint, unit text) ?


I like, though I'd make it numeric or float. pg_size(3.5, 'GB') 
certainly seems like a reasonable use case...

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-22 Thread Pavel Stehule
2015-11-22 21:19 GMT+01:00 Jim Nasby :

> On 11/22/15 2:11 PM, Pavel Stehule wrote:
>
>> What about pg_size(text), pg_size(value bigint, unit text) ?
>>
>
> I like, though I'd make it numeric or float. pg_size(3.5, 'GB') certainly
> seems like a reasonable use case...


yes, good note.

Regards

Pavel


>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-22 Thread Jim Nasby

On 11/21/15 12:49 AM, Pavel Stehule wrote:


I propose inversion function to pg_size_pretty function - like
pg_human_size.

Usage:

SELECT * FROM pg_class
   WHERE pg_table_size(oid) > pg_human_size('2GB');


I'm not a big fan of the name, but +1 on the general idea.

Maybe pg_size_pretty(text)?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] custom function for converting human readable sizes to bytes

2015-11-20 Thread Pavel Stehule
Hi

I have to write some filters, and filtering by size is "unfriendly" due
calculation in bytes.

I propose inversion function to pg_size_pretty function - like
pg_human_size.

Usage:

SELECT * FROM pg_class
  WHERE pg_table_size(oid) > pg_human_size('2GB');


Ideas, comments?

Regards

Pavel