Re: [HACKERS] Re: BIT/BIT VARYING status

Adriaan Joubert Sun, 05 Nov 2000 10:47:10 -0800

Peter,

        I've looked at the current implementation of the bit types and still
have some doubts concerning the following issues:

1. Constants. The current behaviour just seems somewhat strange, and I
have no idea where to fix it.

test=# select B'1001';
 ?column? 
----------
 X9
(1 row)

test=# select B'1001'::bit;
ERROR:  Cannot cast this expression to type 'bit'
test=# select B'1001'::varbit;
ERROR:  Cannot cast this expression to type 'varbit'
test=# select 'B1001'::varbit;
 ?column? 
----------
 B1001
(1 row)

test=# select 'B1001'::bit;
 ?column? 
----------
 X9
(1 row)

test=# select X'1001'::varbit;
ERROR:  varbit_in: The bit string 4097 must start with B or X
test=# select 'X1001'::varbit;
     ?column?      
-------------------
 B0001000000000001
(1 row)

test=# select 'X1001'::bit;
 ?column? 
----------
 X1001
(1 row)

test=# select X'1001'::bit;
ERROR:  zpbit_in: The bit string 4097 must start with B or X

Also, I have two output routines, that have been renames to zpbit_out
and varbit_out. In fact, both will work just fine for bot bit and
varbit, but the first prints as hex and the second as a bit string.
Printing as hex is more compact, so good for long strings, but printing
as a bit string is much more intuitive. One solution would be to make
them both print to a bit string by default and define a function to
generate a hex string. Another would be to have this under control of a
variable. Most people who contacted me about bit strings seemed to want
to use them for flags, so I guess the default should be to print them as
a bit string.

More for my information, if a user does not know about varbit, how does
he cast to bit varying? 

2. This is not a problem, more a question. There is no default way to
compare bit to varbit, as in 

test=# select 'b10'::bit='b10'::varbit;
ERROR:  Unable to identify an operator '=' for types 'bit' and 'varbit'
        You will have to retype this query using an explicit cast

This may be a good thing, as the comparison does depend on the lenght of
the bit strings.

3. The ^ operator seems to attempt to coerce the arguments to float8?

select 'B110011'::bit ^ 'B011101'::bit;
ERROR:  Function 'float8(bit)' does not exist
        Unable to identify a function that satisfies the given argument
types
        You may need to add explicit typecasts

4. This is a policy question. When I use the bit shift operator, this
always shifts within the current string only. So if I do

select ('B010'::bit(6) >> 2)::varbit;
 ?column?  
-----------
 B000100

I get what I would expect. But if I have a bit varying(6) field (in a
table, this is just an example), I only get

select ('B010'::varbit >> 2)::varbit;
 ?column?  
-----------
 B000

which I find counter-intuitive. I have thus added 'zpshiftright' and
'varbitshiftright' functions. The second extends the bitstring to the
right, while the first is the old bitshiftright function. I find this
more intuitive at least. 

Question is what a shift left function should do? Should I shorten the
string in the case of a shift left, to keep it symmetrical to shift
right? This seems a pure policy decision, as there are arguments for
both behaviours, although I am a great fan of symmetry. Let me know and
I can implement a separate function.


I have made a start on a file for regression tests, which I append with
the diffs for the varbit files. Please let me know what else is needed
and where I can help.


Thanks!

Adriaan

--
-- BIT types
--

--
-- Build tables for testing
--

CREATE TABLE ZPBIT_TABLE(b BIT(11));

INSERT INTO ZPBIT_TABLE VALUES ('B');

INSERT INTO ZPBIT_TABLE VALUES ('B0');

INSERT INTO ZPBIT_TABLE VALUES ('B010101');

INSERT INTO ZPBIT_TABLE VALUES ('B01010101010');

INSERT INTO ZPBIT_TABLE VALUES ('B010101010101');

INSERT INTO ZPBIT_TABLE VALUES ('X554');

INSERT INTO ZPBIT_TABLE VALUES ('X555');

SELECT * FROM ZPBIT_TABLE; 

CREATE TABLE VARBIT_TABLE(v BIT VARYING(11));

INSERT INTO VARBIT_TABLE VALUES ('B');

INSERT INTO VARBIT_TABLE VALUES ('B0');

INSERT INTO VARBIT_TABLE VALUES ('B010101');

INSERT INTO VARBIT_TABLE VALUES ('B01010101010');

INSERT INTO VARBIT_TABLE VALUES ('B010101010101');

INSERT INTO VARBIT_TABLE VALUES ('X554');

INSERT INTO VARBIT_TABLE VALUES ('X555');

SELECT * FROM VARBIT_TABLE; 

-- Delete from tables
DROP TABLE ZPBIT_TABLE;
CREATE TABLE ZPBIT_TABLE(b BIT(16));

INSERT INTO ZPBIT_TABLE VALUES ('B11011');
INSERT INTO ZPBIT_TABLE SELECT b>>1 FROM ZPBIT_TABLE;
INSERT INTO ZPBIT_TABLE SELECT b>>2 FROM ZPBIT_TABLE;
INSERT INTO ZPBIT_TABLE SELECT b>>4 FROM ZPBIT_TABLE;
INSERT INTO ZPBIT_TABLE SELECT b>>8 FROM ZPBIT_TABLE;
SELECT POSITION('B1101'::bit IN b) as pos, 
       POSITION('B11011'::bit IN b) as pos,
       b 
       FROM ZPBIT_TABLE ;

DROP TABLE VARBIT_TABLE;
CREATE TABLE VARBIT_TABLE(v BIT VARYING(19));
INSERT INTO VARBIT_TABLE VALUES ('B11011');
INSERT INTO VARBIT_TABLE SELECT v>>1 FROM VARBIT_TABLE;
INSERT INTO VARBIT_TABLE SELECT v>>2 FROM VARBIT_TABLE;
INSERT INTO VARBIT_TABLE SELECT v>>4 FROM VARBIT_TABLE;
INSERT INTO VARBIT_TABLE SELECT v>>8 FROM VARBIT_TABLE;
SELECT POSITION('B1101'::bit IN v) as pos, 
       POSITION('B11011'::bit IN v) as pos,
       v 
       FROM VARBIT_TABLE ;

-- Concatenation
SELECT v, b::varbit AS B, (v || b)::varbit AS C 
       FROM ZPBIT_TABLE, VARBIT_TABLE 
       WHERE v::bit(16)=b
       ORDER BY C;

-- Length
SELECT b, length(b) AS lb
       FROM ZPBIT_TABLE;
SELECT v, length(v) AS lv
       FROM VARBIT_TABLE;

-- Substring
SELECT b::varbit,
       SUBSTRING(b FROM 2 FOR 4)::varbit AS sub1,
       SUBSTRING(b FROM 7 FOR 13)::varbit AS sub2,
       SUBSTRING(b FROM 6)::varbit AS sub3
       FROM ZPBIT_TABLE;
SELECT v,
       SUBSTRING(v FROM 2 FOR 4)::varbit AS sub1,
       SUBSTRING(v FROM 7 FOR 13)::varbit AS sub2,
       SUBSTRING(v FROM 6)::varbit AS sub3
       FROM VARBIT_TABLE;

-- Drop the tables
DROP TABLE ZPBIT_TABLE;
DROP TABLE VARBIT_TABLE;

--- Bit operations
CREATE TABLE varbit_table (a BIT VARYING(16), b BIT VARYING(16));
COPY varbit_table FROM stdin;
X0F     X10
X1F     X11
X2F     X12
X3F     X13
X8F     X04
X000F   X0010
X0123   XFFFF
X2468   X2468
XFA50   X05AF
X1234   XFFF5
\.

SELECT a,b,~a AS "~ a",a & b AS "a & b", 
        a|b AS "a | b", a^b AS "a ^ b" FROM varbit_table;
SELECT a,b,a<b AS "a<b",a<=b AS "a<=b",a=b AS "a=b",
        a>=b AS "a>=b",a>b AS "a>b",a<=>b AS "a<=>b" FROM varbit_table;
SELECT a,a<<4 AS "a<<4",b,b>>2 AS "b>>2" FROM varbit_table;

DROP TABLE varbit_table;

--- Bit operations
CREATE TABLE zpbit (a BIT(16), b BIT(16));
COPY zpbit FROM stdin;
X0F     X10
X1F     X11
X2F     X12
X3F     X13
X8F     X04
X000F   X0010
X0123   XFFFF
X2468   X2468
XFA50   X05AF
X1234   XFFF5
\.

SELECT a,b,~a AS "~ a",a & b AS "a & b", 
        a|b AS "a | b", a^b AS "a ^ b" FROM zpbit;
SELECT a,b,a<b AS "a<b",a<=b AS "a<=b",a=b AS "a=b",
        a>=b AS "a>=b",a>b AS "a>b",a<>b AS "a<>b" FROM zpbit;
SELECT a,a<<4 AS "a<<4",b,b>>2 AS "b>>2" FROM zpbit;

DROP TABLE zpbit;


-- The following should fail
select 'X123'::bit & 'X12'::bit;
select 'B0111'::bit | 'B011'::bit;
select 'X023'::bit ^ 'B011101'::bit;

-- More position tests, checking all the boundary cases
SELECT POSITION('B1010'::bit IN 'B0000101'::bit);   -- 0
SELECT POSITION('B1010'::bit IN 'B00001010'::bit);  -- 5
SELECT POSITION('B1010'::bit IN 'B00000101'::bit);  -- 0
SELECT POSITION('B1010'::bit IN 'B000001010'::bit);  -- 6

SELECT POSITION('B'::bit IN 'B00001010'::bit);  -- 1
SELECT POSITION('B0'::bit IN 'B'::bit);  -- 0
SELECT POSITION('B'::bit IN 'B'::bit);  -- 0
SELECT POSITION('B101101'::bit IN 'B001011011011011000'::bit);  -- 3
SELECT POSITION('B10110110'::bit IN 'B001011011011010'::bit);  -- 3
SELECT POSITION('B1011011011011'::bit IN 'B001011011011011'::bit);  -- 3
SELECT POSITION('B1011011011011'::bit IN 'B00001011011011011'::bit);  -- 5

SELECT POSITION('B11101011'::bit IN 'B11101011'::bit); -- 1
SELECT POSITION('B11101011'::bit IN 'B011101011'::bit); -- 2
SELECT POSITION('B11101011'::bit IN 'B00011101011'::bit); -- 4
SELECT POSITION('B11101011'::bit IN 'B0000011101011'::bit); -- 6

SELECT POSITION('B111010110'::bit IN 'B111010110'::bit); -- 1
SELECT POSITION('B111010110'::bit IN 'B0111010110'::bit); -- 2
SELECT POSITION('B111010110'::bit IN 'B000111010110'::bit); -- 4
SELECT POSITION('B111010110'::bit IN 'B00000111010110'::bit); -- 6

SELECT POSITION('B111010110'::bit IN 'B11101011'::bit); -- 0
SELECT POSITION('B111010110'::bit IN 'B011101011'::bit); -- 0
SELECT POSITION('B111010110'::bit IN 'B00011101011'::bit); -- 0
SELECT POSITION('B111010110'::bit IN 'B0000011101011'::bit); -- 0

SELECT POSITION('B111010110'::bit IN 'B111010110'::bit); -- 1
SELECT POSITION('B111010110'::bit IN 'B0111010110'::bit); -- 2
SELECT POSITION('B111010110'::bit IN 'B000111010110'::bit); -- 4
SELECT POSITION('B111010110'::bit IN 'B00000111010110'::bit); -- 6

SELECT POSITION('B111010110'::bit IN 'B000001110101111101011'::bit); -- 0
SELECT POSITION('B111010110'::bit IN 'B0000001110101111101011'::bit); -- 0
SELECT POSITION('B111010110'::bit IN 'B000000001110101111101011'::bit); -- 0
SELECT POSITION('B111010110'::bit IN 'B00000000001110101111101011'::bit); -- 0

SELECT POSITION('B111010110'::bit IN 'B0000011101011111010110'::bit); -- 14
SELECT POSITION('B111010110'::bit IN 'B00000011101011111010110'::bit); -- 15
SELECT POSITION('B111010110'::bit IN 'B0000000011101011111010110'::bit); -- 17
SELECT POSITION('B111010110'::bit IN 'B000000000011101011111010110'::bit); -- 19

SELECT POSITION('B000000000011101011111010110'::bit IN 
'B000000000011101011111010110'::bit); -- 1
SELECT POSITION('B00000000011101011111010110'::bit IN 
'B000000000011101011111010110'::bit); -- 2
SELECT POSITION('B0000000000011101011111010110'::bit IN 
'B000000000011101011111010110'::bit); -- 0

*** src/backend/utils/adt/varbit.c.old  Sun Nov  5 12:03:04 2000
--- src/backend/utils/adt/varbit.c      Sun Nov  5 19:12:10 2000
***************
*** 73,79 ****
                bit_not_hex = false;
        else
        {
!               elog(ERROR, "zpbit_in: %s is not a valid bitstring", s);
                bit_not_hex = false;    /* keep compiler quiet */
        }
  
--- 73,79 ----
                bit_not_hex = false;
        else
        {
!               elog(ERROR, "zpbit_in: The bit string %s must start with B or X", s);
                bit_not_hex = false;    /* keep compiler quiet */
        }
  
***************
*** 299,305 ****
                bit_not_hex = false;
        else
        {
!               elog(ERROR, "varbit_in: %s is not a valid bitstring", s);
                bit_not_hex = false;    /* keep compiler quiet */
        }
  
--- 299,305 ----
                bit_not_hex = false;
        else
        {
!               elog(ERROR, "varbit_in: The bit string %s must start with B or X", s);
                bit_not_hex = false;    /* keep compiler quiet */
        }
  
***************
*** 744,749 ****
--- 744,752 ----
                           *ps;
  
        bitlen = VARBITLEN(arg);
+       /* Do we have an upper bound? */
+       if (l==-1)
+               l = bitlen;
        e = s + l;
        s1 = Max(s, 1);
        e1 = Min(e, bitlen + 1);
***************
*** 1039,1045 ****
   * do a right shift (i.e. towards the end of the string)
   */
  Datum
! bitshiftright(PG_FUNCTION_ARGS)
  {
        VarBit     *arg = PG_GETARG_VARBIT_P(0);
        int32           shft = PG_GETARG_INT32(1);
--- 1042,1048 ----
   * do a right shift (i.e. towards the end of the string)
   */
  Datum
! zpbitshiftright(PG_FUNCTION_ARGS)
  {
        VarBit     *arg = PG_GETARG_VARBIT_P(0);
        int32           shft = PG_GETARG_INT32(1);
***************
*** 1097,1102 ****
--- 1100,1166 ----
        PG_RETURN_VARBIT_P(result);
  }
  
+ /* bitshiftright
+  * do a right shift (i.e. towards the end of the string)
+  */
+ Datum
+ /*varbitshiftright(PG_FUNCTION_ARGS)*/
+ bitshiftright(PG_FUNCTION_ARGS)
+ {
+       VarBit     *arg = PG_GETARG_VARBIT_P(0);
+       int32           shft = PG_GETARG_INT32(1);
+       VarBit     *result;
+       int                     byte_shift,
+                               byte_len,
+                               ishift,
+                               len;
+       bits8      *p,
+                          *r;
+ 
+       /* Negative shift is a shift to the left */
+       if (shft < 0)
+               PG_RETURN_DATUM(DirectFunctionCall2(bitshiftleft,
+                                                   VarBitPGetDatum(arg),
+                                                   Int32GetDatum(-shft)));
+ 
+       /* When we have a varying bit string, the string may get longer */
+       len = VARBITLEN(arg) + shft;
+       byte_len = VARBITTOTALLEN(len);
+ 
+       result = (VarBit *) palloc(byte_len);
+       VARATT_SIZEP(result) = byte_len;
+       VARBITLEN(result) = len;
+       r = VARBITS(result);
+ 
+       byte_shift = shft / BITS_PER_BYTE;
+       ishift = shft % BITS_PER_BYTE;
+       p = VARBITS(arg);
+ 
+       /* Set the first part of the result to 0 */
+       memset(r, 0, byte_shift);
+       r += byte_shift;
+ 
+       if (ishift == 0)
+       {
+               /* Special case: we can do a memcpy */
+               len = VARBITBYTES(arg);
+               memcpy(r, p, len);
+       }
+       else
+       {
+               if (r < VARBITEND(result))
+                       *r = 0;                         /* initialize first byte */
+               for (; r < VARBITEND(result); p++)
+               {
+                       *r |= *p >> ishift;
+                       if ((++r) < VARBITEND(result))
+                               *r = (*p << (BITS_PER_BYTE - ishift)) & BITMASK;
+               }
+       }
+ 
+       PG_RETURN_VARBIT_P(result);
+ }
+ 
  /* This is not defined in any standard. We retain the natural ordering of
   * bits here, as it just seems more intuitive. 
   */
***************
*** 1216,1224 ****
                                p++;
                                if (p == VARBITEND(arg)) {
                                        mask2 = end_mask << (BITS_PER_BYTE - is);
!                                       is_match = mask2 == 0;
!                                       elog(NOTICE,"S. %d %d em=%2x sm=%2x r=%d",
!                                                i,is,end_mask,mask2,is_match);
                                        break;
                                }
                                cmp = *s << (BITS_PER_BYTE - is);
--- 1280,1286 ----
                                p++;
                                if (p == VARBITEND(arg)) {
                                        mask2 = end_mask << (BITS_PER_BYTE - is);
!                                       is_match = (mask2 == 0);
                                        break;
                                }
                                cmp = *s << (BITS_PER_BYTE - is);

*** src/include/utils/varbit.h.old      Sun Nov  5 20:20:40 2000
--- src/include/utils/varbit.h  Sun Nov  5 19:04:16 2000
***************
*** 80,86 ****
--- 80,88 ----
  extern Datum bitxor(PG_FUNCTION_ARGS);
  extern Datum bitnot(PG_FUNCTION_ARGS);
  extern Datum bitshiftleft(PG_FUNCTION_ARGS);
+ extern Datum zpbitshiftright(PG_FUNCTION_ARGS);
  extern Datum bitshiftright(PG_FUNCTION_ARGS);
+ /*extern Datum varbitshiftright(PG_FUNCTION_ARGS);*/
  extern Datum bitcat(PG_FUNCTION_ARGS);
  extern Datum bitsubstr(PG_FUNCTION_ARGS);
  extern Datum bitlength(PG_FUNCTION_ARGS);

Re: [HACKERS] Re: BIT/BIT VARYING status

Reply via email to