>Synopsis:      Wrong result in right and full outer join with join(1)
>Category:      system
>Environment:
        System      : OpenBSD 6.3
        Details     : OpenBSD 6.3 (GENERIC.MP) #11: Thu Sep 20 16:05:37 CEST 
2018
                         
[email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP

        Architecture: OpenBSD.amd64
        Machine     : amd64
>Description:

The join(1) utility produces the wrong result when performing a right
join or a full outer join between two files under some circumstances.

>How-To-Repeat:

$ cat test1.input
1,A1,B1
2
3,E1,F1

(it makes no difference if the second line reads "2,," or "2,C1," or
"2,C1,D1" above)

$ cat test2.input
1,A2,B2
3,E2,F2

(none of the files has an empty line at the end, that's only added for
readability here)

Doing a left join (using -a1) works as expected:
$ join -t, -o0,1.2,1.3,2.2,2.3 -a1 test1.input test2.input
1,A1,B1,A2,B2
2,,,,
3,E1,F1,E2,F2

Doing a right join (using -a2):
$ join -t, -o0,1.2,1.3,2.2,2.3 -a2 test1.input test2.input
1,A1,B1,A2,B2
3,,,E2,F2

The data from the first file is dropped even though there should be a
match on its "3" line.

The expected result is
1,A1,B1,A2,B2
3,E1,F1,E2,F2

Doing a full outer join:
$ join -t, -o0,1.2,1.3,2.2,2.3 -a1 -a2 test1.input test2.input
1,A1,B1,A2,B2
2,,,,
3,,,E2,F2
3,E1,F1,,

Again, the "3" line in the two files don't appear to get matched up.

The expected result is
1,A1,B1,A2,B2
2,,,,
3,E1,F1,E2,F2

Doing the join with the two files swapped around works in all cases:

Left:
$ join -t, -o0,1.2,1.3,2.2,2.3 -a1 test2.input test1.input
1,A2,B2,A1,B1
3,E2,F2,E1,F1

Right:
$ join -t, -o0,1.2,1.3,2.2,2.3 -a2 test2.input test1.input
1,A2,B2,A1,B1
2,,,,
3,E2,F2,E1,F1

Full outer:
$ join -t, -o0,1.2,1.3,2.2,2.3 -a1 -a2 test2.input test1.input
1,A2,B2,A1,B1
2,,,,
3,E2,F2,E1,F1

Also, inner joins (without using -a at all) works as it should.

$ hexdump -C test1.input
00000000  31 2c 41 31 2c 42 31 0a  32 2c 43 31 2c 44 31 0a  |1,A1,B1.2,C1,D1.|
00000010  33 2c 45 31 2c 46 31 0a                           |3,E1,F1.|
00000018
$ hexdump -C test2.input
00000000  31 2c 41 32 2c 42 32 0a  33 2c 45 32 2c 46 32 0a  |1,A2,B2.3,E2,F2.|
00000010


>Fix:

Currently unknown.

dmesg:
OpenBSD 6.3 (GENERIC.MP) #11: Thu Sep 20 16:05:37 CEST 2018
    
[email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 1056899072 (1007MB)
avail mem = 1017843712 (970MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.5 @ 0xe1000 (10 entries)
bios0: vendor innotek GmbH version "VirtualBox" date 12/01/2006
bios0: innotek GmbH VirtualBox
acpi0 at bios0: rev 2
acpi0: sleep states S0 S5
acpi0: tables DSDT FACP APIC HPET MCFG SSDT
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz, 3193.05 MHz
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,SSSE3,CX16,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,RDRAND,HV,NXE,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,AVX2,INVPCID,L1DF,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
acpitimer0: recalibrated TSC frequency 3192606670 Hz
cpu0: smt 0, core 0, package 0
mtrr: CPU supports MTRRs but not enabled by BIOS
cpu0: apic clock running at 1000MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz, 3207.73 MHz
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,SSSE3,CX16,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,RDRAND,HV,NXE,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,AVX2,INVPCID,L1DF,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 20, 24 pins
, remapped to apid 2
acpihpet0 at acpi0: 14318179 Hz
acpihpet0: recalibrated TSC frequency 3192518130 Hz
acpimcfg0 at acpi0 addr 0xdc000000, bus 0-63
acpiprt0 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0: C1(@1 halt!)
acpicpu1 at acpi0: C1(@1 halt!)
acpiac0 at acpi0: AC unit online
acpivideo0 at acpi0: GFX0
pvbus0 at mainbus0: KVM
pci0 at mainbus0 bus 0
vga1 at pci0 dev 2 function 0 "InnoTek VirtualBox Graphics Adapter" rev 0x00
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
virtio0 at pci0 dev 3 function 0 "Qumranet Virtio Network" rev 0x00
vio0 at virtio0: address 08:00:27:4c:c3:b6
virtio0: apic 2 int 19
"InnoTek VirtualBox Guest Service" rev 0x00 at pci0 dev 4 function 0 not 
configured
piixpm0 at pci0 dev 7 function 0 "Intel 82371AB Power" rev 0x08: apic 2 int 23
iic0 at piixpm0
virtio1 at pci0 dev 8 function 0 "Qumranet Virtio Network" rev 0x00
vio1 at virtio1: address 08:00:27:56:79:c2
virtio1: apic 2 int 16
pcib0 at pci0 dev 31 function 0 "Intel 82801GBM LPC" rev 0x02
pciide0 at pci0 dev 31 function 1 "Intel 6321ESB IDE" rev 0x00: DMA, channel 0 
configured to compatibility, channel 1 configured to compatibility
wd0 at pciide0 channel 0 drive 0: <VBOX HARDDISK>
wd0: 128-sector PIO, LBA, 102400MB, 209715200 sectors
wd1 at pciide0 channel 0 drive 1: <VBOX HARDDISK>
wd1: 128-sector PIO, LBA, 5120MB, 10485760 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
wd1(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 5
atapiscsi0 at pciide0 channel 1 drive 0
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0: <VBOX, CD-ROM, 1.0> ATAPI 5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 5
isa0 at pcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5 irq 1 irq 12
pckbd0 at pckbc0 (kbd slot)
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pms0 at pckbc0 (aux slot)
wsmouse0 at pms0 mux 0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
root on wd0a (411f6d6665fa8a05.a) swap on wd0b dump on wd0b

usbdevs:
usbdevs: no USB controllers found

--
Andreas Kusalananda Kähäri,
National Bioinformatics Infrastructure Sweden (NBIS),
Uppsala University, Sweden.








När du har kontakt med oss på Uppsala universitet med e-post så innebär det att 
vi behandlar dina personuppgifter. För att läsa mer om hur vi gör det kan du 
läsa här: http://www.uu.se/om-uu/dataskydd-personuppgifter/

E-mailing Uppsala University means that we will process your personal data. For 
more information on how this is performed, please read here: 
http://www.uu.se/om-uu/dataskydd-personuppgifter/

Reply via email to