Just to say that this is the observed behaviour on a recent snapshot as
well.  More specifically on

OpenBSD 6.4 (GENERIC.MP) #351: Sat Oct  6 01:06:47 MDT 2018
    [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP


Cheers,
Andreas


On Sat, Oct 06, 2018 at 08:57:35PM +0200, Andreas Kusalananda Kähäri wrote:
> >Synopsis:      Wrong result in right and full outer join with join(1)
> >Category:      system
> >Environment:
>         System      : OpenBSD 6.3
>         Details     : OpenBSD 6.3 (GENERIC.MP) #11: Thu Sep 20 16:05:37 CEST 
> 2018
>                          
> [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
>         Architecture: OpenBSD.amd64
>         Machine     : amd64
> >Description:
> 
> The join(1) utility produces the wrong result when performing a right
> join or a full outer join between two files under some circumstances.
> 
> >How-To-Repeat:
> 
> $ cat test1.input
> 1,A1,B1
> 2
> 3,E1,F1
> 
> (it makes no difference if the second line reads "2,," or "2,C1," or
> "2,C1,D1" above)
> 
> $ cat test2.input
> 1,A2,B2
> 3,E2,F2
> 
> (none of the files has an empty line at the end, that's only added for
> readability here)
> 
> Doing a left join (using -a1) works as expected:
> $ join -t, -o0,1.2,1.3,2.2,2.3 -a1 test1.input test2.input
> 1,A1,B1,A2,B2
> 2,,,,
> 3,E1,F1,E2,F2
> 
> Doing a right join (using -a2):
> $ join -t, -o0,1.2,1.3,2.2,2.3 -a2 test1.input test2.input
> 1,A1,B1,A2,B2
> 3,,,E2,F2
> 
> The data from the first file is dropped even though there should be a
> match on its "3" line.
> 
> The expected result is
> 1,A1,B1,A2,B2
> 3,E1,F1,E2,F2
> 
> Doing a full outer join:
> $ join -t, -o0,1.2,1.3,2.2,2.3 -a1 -a2 test1.input test2.input
> 1,A1,B1,A2,B2
> 2,,,,
> 3,,,E2,F2
> 3,E1,F1,,
> 
> Again, the "3" line in the two files don't appear to get matched up.
> 
> The expected result is
> 1,A1,B1,A2,B2
> 2,,,,
> 3,E1,F1,E2,F2
> 
> Doing the join with the two files swapped around works in all cases:
> 
> Left:
> $ join -t, -o0,1.2,1.3,2.2,2.3 -a1 test2.input test1.input
> 1,A2,B2,A1,B1
> 3,E2,F2,E1,F1
> 
> Right:
> $ join -t, -o0,1.2,1.3,2.2,2.3 -a2 test2.input test1.input
> 1,A2,B2,A1,B1
> 2,,,,
> 3,E2,F2,E1,F1
> 
> Full outer:
> $ join -t, -o0,1.2,1.3,2.2,2.3 -a1 -a2 test2.input test1.input
> 1,A2,B2,A1,B1
> 2,,,,
> 3,E2,F2,E1,F1
> 
> Also, inner joins (without using -a at all) works as it should.
> 
> $ hexdump -C test1.input
> 00000000  31 2c 41 31 2c 42 31 0a  32 2c 43 31 2c 44 31 0a  |1,A1,B1.2,C1,D1.|
> 00000010  33 2c 45 31 2c 46 31 0a                           |3,E1,F1.|
> 00000018
> $ hexdump -C test2.input
> 00000000  31 2c 41 32 2c 42 32 0a  33 2c 45 32 2c 46 32 0a  |1,A2,B2.3,E2,F2.|
> 00000010
> 
> 
> >Fix:
> 
> Currently unknown.
> 
> dmesg:
> OpenBSD 6.3 (GENERIC.MP) #11: Thu Sep 20 16:05:37 CEST 2018
>     
> [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 1056899072 (1007MB)
> avail mem = 1017843712 (970MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.5 @ 0xe1000 (10 entries)
> bios0: vendor innotek GmbH version "VirtualBox" date 12/01/2006
> bios0: innotek GmbH VirtualBox
> acpi0 at bios0: rev 2
> acpi0: sleep states S0 S5
> acpi0: tables DSDT FACP APIC HPET MCFG SSDT
> acpi0: wakeup devices
> acpitimer0 at acpi0: 3579545 Hz, 32 bits
> acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz, 3193.05 MHz
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,SSSE3,CX16,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,RDRAND,HV,NXE,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,AVX2,INVPCID,L1DF,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> acpitimer0: recalibrated TSC frequency 3192606670 Hz
> cpu0: smt 0, core 0, package 0
> mtrr: CPU supports MTRRs but not enabled by BIOS
> cpu0: apic clock running at 1000MHz
> cpu1 at mainbus0: apid 1 (application processor)
> cpu1: Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz, 3207.73 MHz
> cpu1: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,SSSE3,CX16,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,RDRAND,HV,NXE,RDTSCP,LONG,LAHF,ABM,ITSC,FSGSBASE,AVX2,INVPCID,L1DF,MELTDOWN
> cpu1: 256KB 64b/line 8-way L2 cache
> cpu1: smt 0, core 1, package 0
> ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 20, 24 pins
> , remapped to apid 2
> acpihpet0 at acpi0: 14318179 Hz
> acpihpet0: recalibrated TSC frequency 3192518130 Hz
> acpimcfg0 at acpi0 addr 0xdc000000, bus 0-63
> acpiprt0 at acpi0: bus 0 (PCI0)
> acpicpu0 at acpi0: C1(@1 halt!)
> acpicpu1 at acpi0: C1(@1 halt!)
> acpiac0 at acpi0: AC unit online
> acpivideo0 at acpi0: GFX0
> pvbus0 at mainbus0: KVM
> pci0 at mainbus0 bus 0
> vga1 at pci0 dev 2 function 0 "InnoTek VirtualBox Graphics Adapter" rev 0x00
> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
> virtio0 at pci0 dev 3 function 0 "Qumranet Virtio Network" rev 0x00
> vio0 at virtio0: address 08:00:27:4c:c3:b6
> virtio0: apic 2 int 19
> "InnoTek VirtualBox Guest Service" rev 0x00 at pci0 dev 4 function 0 not 
> configured
> piixpm0 at pci0 dev 7 function 0 "Intel 82371AB Power" rev 0x08: apic 2 int 23
> iic0 at piixpm0
> virtio1 at pci0 dev 8 function 0 "Qumranet Virtio Network" rev 0x00
> vio1 at virtio1: address 08:00:27:56:79:c2
> virtio1: apic 2 int 16
> pcib0 at pci0 dev 31 function 0 "Intel 82801GBM LPC" rev 0x02
> pciide0 at pci0 dev 31 function 1 "Intel 6321ESB IDE" rev 0x00: DMA, channel 
> 0 configured to compatibility, channel 1 configured to compatibility
> wd0 at pciide0 channel 0 drive 0: <VBOX HARDDISK>
> wd0: 128-sector PIO, LBA, 102400MB, 209715200 sectors
> wd1 at pciide0 channel 0 drive 1: <VBOX HARDDISK>
> wd1: 128-sector PIO, LBA, 5120MB, 10485760 sectors
> wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
> wd1(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 5
> atapiscsi0 at pciide0 channel 1 drive 0
> scsibus1 at atapiscsi0: 2 targets
> cd0 at scsibus1 targ 0 lun 0: <VBOX, CD-ROM, 1.0> ATAPI 5/cdrom removable
> cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 5
> isa0 at pcib0
> isadma0 at isa0
> pckbc0 at isa0 port 0x60/5 irq 1 irq 12
> pckbd0 at pckbc0 (kbd slot)
> wskbd0 at pckbd0: console keyboard, using wsdisplay0
> pms0 at pckbc0 (aux slot)
> wsmouse0 at pms0 mux 0
> pcppi0 at isa0 port 0x61
> spkr0 at pcppi0
> vscsi0 at root
> scsibus2 at vscsi0: 256 targets
> softraid0 at root
> scsibus3 at softraid0: 256 targets
> root on wd0a (411f6d6665fa8a05.a) swap on wd0b dump on wd0b
> 
> usbdevs:
> usbdevs: no USB controllers found
> 
> --
> Andreas Kusalananda Kähäri,
> National Bioinformatics Infrastructure Sweden (NBIS),
> Uppsala University, Sweden.
> 
> 
> 
> 
> 
> 
> 
> 
> När du har kontakt med oss på Uppsala universitet med e-post så innebär det 
> att vi behandlar dina personuppgifter. För att läsa mer om hur vi gör det kan 
> du läsa här: http://www.uu.se/om-uu/dataskydd-personuppgifter/
> 
> E-mailing Uppsala University means that we will process your personal data. 
> For more information on how this is performed, please read here: 
> http://www.uu.se/om-uu/dataskydd-personuppgifter/
> 

-- 
Andreas Kusalananda Kähäri,
National Bioinformatics Infrastructure Sweden (NBIS),
Uppsala University, Sweden.

Reply via email to