Peter,
I've looked at the current implementation of the bit types and still
have some doubts concerning the following issues:
1. Constants. The current behaviour just seems somewhat strange, and I
have no idea where to fix it.
test=# select B'1001';
?column?
----------
X9
(1 row)
test=# select B'1001'::bit;
ERROR: Cannot cast this expression to type 'bit'
test=# select B'1001'::varbit;
ERROR: Cannot cast this expression to type 'varbit'
test=# select 'B1001'::varbit;
?column?
----------
B1001
(1 row)
test=# select 'B1001'::bit;
?column?
----------
X9
(1 row)
test=# select X'1001'::varbit;
ERROR: varbit_in: The bit string 4097 must start with B or X
test=# select 'X1001'::varbit;
?column?
-------------------
B0001000000000001
(1 row)
test=# select 'X1001'::bit;
?column?
----------
X1001
(1 row)
test=# select X'1001'::bit;
ERROR: zpbit_in: The bit string 4097 must start with B or X
Also, I have two output routines, that have been renames to zpbit_out
and varbit_out. In fact, both will work just fine for bot bit and
varbit, but the first prints as hex and the second as a bit string.
Printing as hex is more compact, so good for long strings, but printing
as a bit string is much more intuitive. One solution would be to make
them both print to a bit string by default and define a function to
generate a hex string. Another would be to have this under control of a
variable. Most people who contacted me about bit strings seemed to want
to use them for flags, so I guess the default should be to print them as
a bit string.
More for my information, if a user does not know about varbit, how does
he cast to bit varying?
2. This is not a problem, more a question. There is no default way to
compare bit to varbit, as in
test=# select 'b10'::bit='b10'::varbit;
ERROR: Unable to identify an operator '=' for types 'bit' and 'varbit'
You will have to retype this query using an explicit cast
This may be a good thing, as the comparison does depend on the lenght of
the bit strings.
3. The ^ operator seems to attempt to coerce the arguments to float8?
select 'B110011'::bit ^ 'B011101'::bit;
ERROR: Function 'float8(bit)' does not exist
Unable to identify a function that satisfies the given argument
types
You may need to add explicit typecasts
4. This is a policy question. When I use the bit shift operator, this
always shifts within the current string only. So if I do
select ('B010'::bit(6) >> 2)::varbit;
?column?
-----------
B000100
I get what I would expect. But if I have a bit varying(6) field (in a
table, this is just an example), I only get
select ('B010'::varbit >> 2)::varbit;
?column?
-----------
B000
which I find counter-intuitive. I have thus added 'zpshiftright' and
'varbitshiftright' functions. The second extends the bitstring to the
right, while the first is the old bitshiftright function. I find this
more intuitive at least.
Question is what a shift left function should do? Should I shorten the
string in the case of a shift left, to keep it symmetrical to shift
right? This seems a pure policy decision, as there are arguments for
both behaviours, although I am a great fan of symmetry. Let me know and
I can implement a separate function.
I have made a start on a file for regression tests, which I append with
the diffs for the varbit files. Please let me know what else is needed
and where I can help.
Thanks!
Adriaan
--
-- BIT types
--
--
-- Build tables for testing
--
CREATE TABLE ZPBIT_TABLE(b BIT(11));
INSERT INTO ZPBIT_TABLE VALUES ('B');
INSERT INTO ZPBIT_TABLE VALUES ('B0');
INSERT INTO ZPBIT_TABLE VALUES ('B010101');
INSERT INTO ZPBIT_TABLE VALUES ('B01010101010');
INSERT INTO ZPBIT_TABLE VALUES ('B010101010101');
INSERT INTO ZPBIT_TABLE VALUES ('X554');
INSERT INTO ZPBIT_TABLE VALUES ('X555');
SELECT * FROM ZPBIT_TABLE;
CREATE TABLE VARBIT_TABLE(v BIT VARYING(11));
INSERT INTO VARBIT_TABLE VALUES ('B');
INSERT INTO VARBIT_TABLE VALUES ('B0');
INSERT INTO VARBIT_TABLE VALUES ('B010101');
INSERT INTO VARBIT_TABLE VALUES ('B01010101010');
INSERT INTO VARBIT_TABLE VALUES ('B010101010101');
INSERT INTO VARBIT_TABLE VALUES ('X554');
INSERT INTO VARBIT_TABLE VALUES ('X555');
SELECT * FROM VARBIT_TABLE;
-- Delete from tables
DROP TABLE ZPBIT_TABLE;
CREATE TABLE ZPBIT_TABLE(b BIT(16));
INSERT INTO ZPBIT_TABLE VALUES ('B11011');
INSERT INTO ZPBIT_TABLE SELECT b>>1 FROM ZPBIT_TABLE;
INSERT INTO ZPBIT_TABLE SELECT b>>2 FROM ZPBIT_TABLE;
INSERT INTO ZPBIT_TABLE SELECT b>>4 FROM ZPBIT_TABLE;
INSERT INTO ZPBIT_TABLE SELECT b>>8 FROM ZPBIT_TABLE;
SELECT POSITION('B1101'::bit IN b) as pos,
POSITION('B11011'::bit IN b) as pos,
b
FROM ZPBIT_TABLE ;
DROP TABLE VARBIT_TABLE;
CREATE TABLE VARBIT_TABLE(v BIT VARYING(19));
INSERT INTO VARBIT_TABLE VALUES ('B11011');
INSERT INTO VARBIT_TABLE SELECT v>>1 FROM VARBIT_TABLE;
INSERT INTO VARBIT_TABLE SELECT v>>2 FROM VARBIT_TABLE;
INSERT INTO VARBIT_TABLE SELECT v>>4 FROM VARBIT_TABLE;
INSERT INTO VARBIT_TABLE SELECT v>>8 FROM VARBIT_TABLE;
SELECT POSITION('B1101'::bit IN v) as pos,
POSITION('B11011'::bit IN v) as pos,
v
FROM VARBIT_TABLE ;
-- Concatenation
SELECT v, b::varbit AS B, (v || b)::varbit AS C
FROM ZPBIT_TABLE, VARBIT_TABLE
WHERE v::bit(16)=b
ORDER BY C;
-- Length
SELECT b, length(b) AS lb
FROM ZPBIT_TABLE;
SELECT v, length(v) AS lv
FROM VARBIT_TABLE;
-- Substring
SELECT b::varbit,
SUBSTRING(b FROM 2 FOR 4)::varbit AS sub1,
SUBSTRING(b FROM 7 FOR 13)::varbit AS sub2,
SUBSTRING(b FROM 6)::varbit AS sub3
FROM ZPBIT_TABLE;
SELECT v,
SUBSTRING(v FROM 2 FOR 4)::varbit AS sub1,
SUBSTRING(v FROM 7 FOR 13)::varbit AS sub2,
SUBSTRING(v FROM 6)::varbit AS sub3
FROM VARBIT_TABLE;
-- Drop the tables
DROP TABLE ZPBIT_TABLE;
DROP TABLE VARBIT_TABLE;
--- Bit operations
CREATE TABLE varbit_table (a BIT VARYING(16), b BIT VARYING(16));
COPY varbit_table FROM stdin;
X0F X10
X1F X11
X2F X12
X3F X13
X8F X04
X000F X0010
X0123 XFFFF
X2468 X2468
XFA50 X05AF
X1234 XFFF5
\.
SELECT a,b,~a AS "~ a",a & b AS "a & b",
a|b AS "a | b", a^b AS "a ^ b" FROM varbit_table;
SELECT a,b,a<b AS "a<b",a<=b AS "a<=b",a=b AS "a=b",
a>=b AS "a>=b",a>b AS "a>b",a<=>b AS "a<=>b" FROM varbit_table;
SELECT a,a<<4 AS "a<<4",b,b>>2 AS "b>>2" FROM varbit_table;
DROP TABLE varbit_table;
--- Bit operations
CREATE TABLE zpbit (a BIT(16), b BIT(16));
COPY zpbit FROM stdin;
X0F X10
X1F X11
X2F X12
X3F X13
X8F X04
X000F X0010
X0123 XFFFF
X2468 X2468
XFA50 X05AF
X1234 XFFF5
\.
SELECT a,b,~a AS "~ a",a & b AS "a & b",
a|b AS "a | b", a^b AS "a ^ b" FROM zpbit;
SELECT a,b,a<b AS "a<b",a<=b AS "a<=b",a=b AS "a=b",
a>=b AS "a>=b",a>b AS "a>b",a<>b AS "a<>b" FROM zpbit;
SELECT a,a<<4 AS "a<<4",b,b>>2 AS "b>>2" FROM zpbit;
DROP TABLE zpbit;
-- The following should fail
select 'X123'::bit & 'X12'::bit;
select 'B0111'::bit | 'B011'::bit;
select 'X023'::bit ^ 'B011101'::bit;
-- More position tests, checking all the boundary cases
SELECT POSITION('B1010'::bit IN 'B0000101'::bit); -- 0
SELECT POSITION('B1010'::bit IN 'B00001010'::bit); -- 5
SELECT POSITION('B1010'::bit IN 'B00000101'::bit); -- 0
SELECT POSITION('B1010'::bit IN 'B000001010'::bit); -- 6
SELECT POSITION('B'::bit IN 'B00001010'::bit); -- 1
SELECT POSITION('B0'::bit IN 'B'::bit); -- 0
SELECT POSITION('B'::bit IN 'B'::bit); -- 0
SELECT POSITION('B101101'::bit IN 'B001011011011011000'::bit); -- 3
SELECT POSITION('B10110110'::bit IN 'B001011011011010'::bit); -- 3
SELECT POSITION('B1011011011011'::bit IN 'B001011011011011'::bit); -- 3
SELECT POSITION('B1011011011011'::bit IN 'B00001011011011011'::bit); -- 5
SELECT POSITION('B11101011'::bit IN 'B11101011'::bit); -- 1
SELECT POSITION('B11101011'::bit IN 'B011101011'::bit); -- 2
SELECT POSITION('B11101011'::bit IN 'B00011101011'::bit); -- 4
SELECT POSITION('B11101011'::bit IN 'B0000011101011'::bit); -- 6
SELECT POSITION('B111010110'::bit IN 'B111010110'::bit); -- 1
SELECT POSITION('B111010110'::bit IN 'B0111010110'::bit); -- 2
SELECT POSITION('B111010110'::bit IN 'B000111010110'::bit); -- 4
SELECT POSITION('B111010110'::bit IN 'B00000111010110'::bit); -- 6
SELECT POSITION('B111010110'::bit IN 'B11101011'::bit); -- 0
SELECT POSITION('B111010110'::bit IN 'B011101011'::bit); -- 0
SELECT POSITION('B111010110'::bit IN 'B00011101011'::bit); -- 0
SELECT POSITION('B111010110'::bit IN 'B0000011101011'::bit); -- 0
SELECT POSITION('B111010110'::bit IN 'B111010110'::bit); -- 1
SELECT POSITION('B111010110'::bit IN 'B0111010110'::bit); -- 2
SELECT POSITION('B111010110'::bit IN 'B000111010110'::bit); -- 4
SELECT POSITION('B111010110'::bit IN 'B00000111010110'::bit); -- 6
SELECT POSITION('B111010110'::bit IN 'B000001110101111101011'::bit); -- 0
SELECT POSITION('B111010110'::bit IN 'B0000001110101111101011'::bit); -- 0
SELECT POSITION('B111010110'::bit IN 'B000000001110101111101011'::bit); -- 0
SELECT POSITION('B111010110'::bit IN 'B00000000001110101111101011'::bit); -- 0
SELECT POSITION('B111010110'::bit IN 'B0000011101011111010110'::bit); -- 14
SELECT POSITION('B111010110'::bit IN 'B00000011101011111010110'::bit); -- 15
SELECT POSITION('B111010110'::bit IN 'B0000000011101011111010110'::bit); -- 17
SELECT POSITION('B111010110'::bit IN 'B000000000011101011111010110'::bit); -- 19
SELECT POSITION('B000000000011101011111010110'::bit IN
'B000000000011101011111010110'::bit); -- 1
SELECT POSITION('B00000000011101011111010110'::bit IN
'B000000000011101011111010110'::bit); -- 2
SELECT POSITION('B0000000000011101011111010110'::bit IN
'B000000000011101011111010110'::bit); -- 0
*** src/backend/utils/adt/varbit.c.old Sun Nov 5 12:03:04 2000
--- src/backend/utils/adt/varbit.c Sun Nov 5 19:12:10 2000
***************
*** 73,79 ****
bit_not_hex = false;
else
{
! elog(ERROR, "zpbit_in: %s is not a valid bitstring", s);
bit_not_hex = false; /* keep compiler quiet */
}
--- 73,79 ----
bit_not_hex = false;
else
{
! elog(ERROR, "zpbit_in: The bit string %s must start with B or X", s);
bit_not_hex = false; /* keep compiler quiet */
}
***************
*** 299,305 ****
bit_not_hex = false;
else
{
! elog(ERROR, "varbit_in: %s is not a valid bitstring", s);
bit_not_hex = false; /* keep compiler quiet */
}
--- 299,305 ----
bit_not_hex = false;
else
{
! elog(ERROR, "varbit_in: The bit string %s must start with B or X", s);
bit_not_hex = false; /* keep compiler quiet */
}
***************
*** 744,749 ****
--- 744,752 ----
*ps;
bitlen = VARBITLEN(arg);
+ /* Do we have an upper bound? */
+ if (l==-1)
+ l = bitlen;
e = s + l;
s1 = Max(s, 1);
e1 = Min(e, bitlen + 1);
***************
*** 1039,1045 ****
* do a right shift (i.e. towards the end of the string)
*/
Datum
! bitshiftright(PG_FUNCTION_ARGS)
{
VarBit *arg = PG_GETARG_VARBIT_P(0);
int32 shft = PG_GETARG_INT32(1);
--- 1042,1048 ----
* do a right shift (i.e. towards the end of the string)
*/
Datum
! zpbitshiftright(PG_FUNCTION_ARGS)
{
VarBit *arg = PG_GETARG_VARBIT_P(0);
int32 shft = PG_GETARG_INT32(1);
***************
*** 1097,1102 ****
--- 1100,1166 ----
PG_RETURN_VARBIT_P(result);
}
+ /* bitshiftright
+ * do a right shift (i.e. towards the end of the string)
+ */
+ Datum
+ /*varbitshiftright(PG_FUNCTION_ARGS)*/
+ bitshiftright(PG_FUNCTION_ARGS)
+ {
+ VarBit *arg = PG_GETARG_VARBIT_P(0);
+ int32 shft = PG_GETARG_INT32(1);
+ VarBit *result;
+ int byte_shift,
+ byte_len,
+ ishift,
+ len;
+ bits8 *p,
+ *r;
+
+ /* Negative shift is a shift to the left */
+ if (shft < 0)
+ PG_RETURN_DATUM(DirectFunctionCall2(bitshiftleft,
+ VarBitPGetDatum(arg),
+ Int32GetDatum(-shft)));
+
+ /* When we have a varying bit string, the string may get longer */
+ len = VARBITLEN(arg) + shft;
+ byte_len = VARBITTOTALLEN(len);
+
+ result = (VarBit *) palloc(byte_len);
+ VARATT_SIZEP(result) = byte_len;
+ VARBITLEN(result) = len;
+ r = VARBITS(result);
+
+ byte_shift = shft / BITS_PER_BYTE;
+ ishift = shft % BITS_PER_BYTE;
+ p = VARBITS(arg);
+
+ /* Set the first part of the result to 0 */
+ memset(r, 0, byte_shift);
+ r += byte_shift;
+
+ if (ishift == 0)
+ {
+ /* Special case: we can do a memcpy */
+ len = VARBITBYTES(arg);
+ memcpy(r, p, len);
+ }
+ else
+ {
+ if (r < VARBITEND(result))
+ *r = 0; /* initialize first byte */
+ for (; r < VARBITEND(result); p++)
+ {
+ *r |= *p >> ishift;
+ if ((++r) < VARBITEND(result))
+ *r = (*p << (BITS_PER_BYTE - ishift)) & BITMASK;
+ }
+ }
+
+ PG_RETURN_VARBIT_P(result);
+ }
+
/* This is not defined in any standard. We retain the natural ordering of
* bits here, as it just seems more intuitive.
*/
***************
*** 1216,1224 ****
p++;
if (p == VARBITEND(arg)) {
mask2 = end_mask << (BITS_PER_BYTE - is);
! is_match = mask2 == 0;
! elog(NOTICE,"S. %d %d em=%2x sm=%2x r=%d",
! i,is,end_mask,mask2,is_match);
break;
}
cmp = *s << (BITS_PER_BYTE - is);
--- 1280,1286 ----
p++;
if (p == VARBITEND(arg)) {
mask2 = end_mask << (BITS_PER_BYTE - is);
! is_match = (mask2 == 0);
break;
}
cmp = *s << (BITS_PER_BYTE - is);
*** src/include/utils/varbit.h.old Sun Nov 5 20:20:40 2000
--- src/include/utils/varbit.h Sun Nov 5 19:04:16 2000
***************
*** 80,86 ****
--- 80,88 ----
extern Datum bitxor(PG_FUNCTION_ARGS);
extern Datum bitnot(PG_FUNCTION_ARGS);
extern Datum bitshiftleft(PG_FUNCTION_ARGS);
+ extern Datum zpbitshiftright(PG_FUNCTION_ARGS);
extern Datum bitshiftright(PG_FUNCTION_ARGS);
+ /*extern Datum varbitshiftright(PG_FUNCTION_ARGS);*/
extern Datum bitcat(PG_FUNCTION_ARGS);
extern Datum bitsubstr(PG_FUNCTION_ARGS);
extern Datum bitlength(PG_FUNCTION_ARGS);