Hi Igor,

Thanks for packaging Beignet for Fedora so promptly. It helps bring the latest Beignet to ordinary users.

Thanks,
Zhigang Gong.

> -----Original Message-----
> From: Beignet [mailto:[email protected]] On Behalf Of Igor Gnatenko
> Sent: Tuesday, November 18, 2014 4:12 AM
> To: Zhigang Gong
> Cc: michael.fu; Zou Nanhai; An open source open CL implemenation for Intel platform
> Subject: Re: [Beignet] [ANNOUNCE] Beignet 1.0.0 (2014-11-14)
>
> Hi,
>
> cool release! I've updated beignet to 1.0.0 for Fedora 20, 21 and Rawhide (22)
> today. Let's use it!
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1142892
>
> On Fri, Nov 14, 2014 at 9:09 AM, Zhigang Gong <[email protected]> wrote:
> > Beignet 1.0.0 (2014-11-14)
> > =========================
> >
> > Beignet development team is proud to announce that Beignet 1.0.0 has
> > been released. This is an important milestone after about two years of
> > development. Thanks for everyone who helped us to improve it to
> > relatively mature state.
> >
> > Now beignet supports from 3rd to 5th Generation Intel Core Processors.
> > Besides the Broadwell support, this release also bring major
> > performance improvement for many workloads and fixed some bugs. We
> > observed 10% to more than 4x performance gain for some OpenCV 3.0
> > benchmarks.
> >
> > The highlighted items are as below:
> >
> > 1. Added 5th generation Intel Core Processors (BDW) support.
> > 2. Optimized constant buffer load.
> > 3. Implement basic transformation from unstructurized control flow to
> >    structurized control flow to improve performance.
> > 4. Fixed some memory leak bugs.
> > 5. Implemented missing constant expression handling.
> > 6. Added Clang/ICC compiler support for Beignet build.
> > 7. Optimized unaligned char/short vector load.
> > 8. Speed up kernel compiling time by move built-in functions support
> >    from header file into linked library.
> > 9. Implemented some missing llvm intrinsics.
> > 10. Optimized loop unrolling pass, boosted some OpenCV benchmarks.
> > 11. Several other bug fixes since last release. For OpenCV 3.0 /
> >     OpenCV 2.4/piglit test suite, Beignet's pass rates are all
> >     above 99%.
> >
> > Git tag: Release_v1.0.0
> > Gitweb URL: http://cgit.freedesktop.org/beignet
> > https://01.org/sites/default/files/beignet-1.0.0-source.tar.gz
> >
> > md5sum: bfd755904c332cdd285d6058f5f3de8c Beignet-1.0.0-Source.tar.gz
> > sha1sum: a2b0eb53e5f9a6055cd656531532a4c6ae03fbb0 Beignet-1.0.0-Source.tar.gz
> > sha256sum: e30c4d0f4c8917fa0df2467b2d70a4ee524f28d54c42c582262d5f08928ea543 Beignet-1.0.0-Source.tar.gz
> >
> > -----------------------------------------------------------------
> >
> > Changes since 0.9.3:
> >
> > Andreas Beckmann (2):
> >       fix some typos
> >       use env to set environment variables for GBE_BIN_GENERATER
> >
> > Chuanbo Weng (1):
> >       utest: add new test that trigger an assignment operation bug in if.
> >
> > Guo Yejun (18):
> >       remove requirment as drm master in non-x environment
> >       remove requirment as drm master in non-x environment
> >       free build_log when the cl program is released
> >       free build_log when the cl program is released
> >       fix three memory leaks
> >       clean llvm resource in compiler (libgbe.so)
> >       fix three memory leaks
> >       clean llvm resource in compiler (libgbe.so)
> >       delete GEPInst when it is no longer used
> >       delete GEPInst when it is no longer used
> >       remove dependency for non-X runtime environment
> >       remove dependency for non-X runtime environment
> >       support CL_MEM_USE_HOST_PTR with userptr for cl buffer
> >       enable CL_DEVICE_HOST_UNIFIED_MEMORY when userptr is supported
> >       add test for cl buffer created with CL_MEM_USE_HOST_PTR
> >       fix issue to create cl image from libva with non-zero offset
> >       add test for clCreateImageFromLibvaIntel
> >       use posix_memalign instead of aligned_alloc to be more compatible
> >
> > Junyan He (54):
> >       Fix the global string bug for printf.
> >       Fix a bug for runtime_barrier_list.cpp, event array out of bound
> >       Fix a bug for runtime_barrier_list.cpp, event array out of bound
> >       Fix the global string bug for printf.
> >       Add common define header files to initialize the libocl
> >       Add the async module into the libocl
> >       Add the atomic module into the libocl
> >       Add the geometric module into the libocl
> >       Add the image module into the libocl
> >       Add the misc module into the libocl
> >       Add the sync module into the libocl
> >       Add printf module into libocl
> >       Add vload module into the libocl
> >       Add thw workitem module into the libocl
> >       Add the convert and as modules into the libocl
> >       Add the gen_vector script into the libocl
> >       Add the common module into the libocl as template
> >       Add the integer module into libocl as template
> >       Add the math function into libocl as template
> >       Add the relational module into libocl as template
> >       Add the ocl_defines header file into libocl
> >       Add memcpy, memset and barrier bitcode files into libocl
> >       Add the bit code linker into the module pass.
> >       Enable libocl and disable the usage of the old huge header.
> >       Use the PCH to accelerate the parsing speed of the ocl.h
> >       Delete all the unused files of old huge header.
> >       Add the missing function prototypes of any() and atom_add()
> >       Add uncompatible PCH Options to avoid compiling failure.
> >       Fix the global string bug for printf.
> >       Add copyright header for all libocl files.
> >       Fix the issue of -cl-std=CLX.X option.
> >       Fix the issue of -cl-std=CLX.X option.
> >       Add the switch logic for math conformance fast path
> >       Modify the CMakeList to use the internal PCH first.
> >       Fix the bug of LLVM_LFLAGS fail to set
> >       Add long support for printf
> >       BDW: Add gen8 surface state struct.
> >       BDW: refine the gen8_surface_state_t.
> >       BDW: Add function intel_gpgpu_setup_bti for gen8.
> >       BDW: Correct surface base address set in setup bti.
> >       BDW: Add function intel_gpgpu_bind_buf for gen8.
> >       Add sampler state and tile define for gen8.
> >       Modify the bind sampler logic for gen8
> >       BDW: Add gen8 into intel_driver_init
> >       Refine the shared function ID define.
> >       Add the libdrm version check.
> >       Let the failure of intel_drm lib's check as a FATAL_ERROR
> >       Fit the printf bug in loop
> >       Fix the bug of 1D array slice pitch
> >       Add the test case for image 1d array fill
> >       Add the test case for image 2d array fill
> >       Add the disasm support for Gen8
> >       Fix the compare_image_2d_and_1d_array test case bug
> >       Fix the bug of multi-thread crash
> >
> > Luo (5):
> >       remove lspci, gbe_bin_genenrater would generator llvm binary by default.
> >       remove lspci, gbe_bin_genenrater would generator llvm binary by default.
> >       fix piglit get kernel info FUNCTION ATTRIBUTE fail.
> >       fix piglit get kernel info FUNCTION ATTRIBUTE fail.
> >       add opencl-1.2 builtin function popcount.
> >
> > Luo Xionghu (28):
> >       fix the relational built-in vector function regression.
> >       fix opencv_test_imgproc subcase OCL_ImgProc/Accumulate.Mask regression.
> >       fix piglit cl-api-get-program-info fail.
> >       fix piglit cl-api-get-program-info fail.
> >       fix clGetKernelWorkGroupInfo built-in kernel fail.
> >       fix piglit cl-api-set-kernel-arg fail.
> >       fix clGetKernelWorkGroupInfo built-in kernel fail.
> >       fix piglit cl-api-set-kernel-arg fail.
> >       fix bin/cl-program-tester tests/cl/program/execute/attributes.cl regression.
> >       fix bin/cl-program-tester tests/cl/program/execute/attributes.cl regression.
> >       remove the LinkOnceAnyLinkage since the libocl is introduced.
> >       improve the build performance of vector type built-in function.
> >       fix one bug at cl_get_kernel_workgroup_info.
> >       fix utest memory leak.
> >       Add Gen IR WHILE.
> >       add handleSelfLoopNode to insert while instruction on Gen IR level.
> >       Use instruction WHILE to manipulate structure.
> >       add utest popcount for all types.
> >       use global flag 0.0 to control unstructured simple block.
> >       add llvm Intrinsic call support.
> >       add utest compiler_overflow for llvm intrinsic function.
> >       enable llvm intrinsic call usub_with_overflow funtion.
> >       add utest for llvm intrinsic call usub_with_overflow funtion.
> >       enable llvm intrinsic call bswap function.
> >       add utest function bswap.
> >       fix bswap kernel function type issue.
> >       fix piglit clCreateProgramWithBinary fail.
> >       fix a bug in clCompileProgram().
> >
> > LuoXionghu (5):
> >       add platform info in the gen binary code.
> >       add utest load_program_from_gen_bin.
> >       add platform info in the gen binary code.
> >       add utest load_program_from_gen_bin.
> >       improve the build performance of vector type built-in function.
> >
> > Lv Meng (6):
> >       improve the clEnqueueCopyBufferRect performance in some cases
> >       Fix compile error for ICC compiler
> >       Fix compile errors for CLANG compiler
> >       Fix compile warnings for ICC compiler
> >       Fix compile warnings for CLANG compiler
> >       Enable ICC and CLANG compiler for beignet
> >
> > Meng Mengmeng (3):
> >       add beignet GIT_HAL1 if there is .git directory
> >       create GIT_SHA1 without any dependency
> >       add building dependency GIT_SHA1
> >
> > Rebecca Palmer (7):
> >       Fail gracefully on unsupported hardware
> >       Fail gracefully on unsupported hardware
> >       GBE: fix bug in pow()/pown().
> >       GBE: fix bug in erf()/erfc().
> >       GBE: fix bug in tgamma().
> >       utests: fix bugs in builtin_pow().
> >       utests: fix bugs in builtin_tgamma().
> >
> > Ruiling Song (43):
> >       GBE: Fix builtin tanpi.
> >       GBE: Fix builtin tanpi.
> >       GBE: Use varying register to save one instruction
> >       GBE: Optimize constant load with sampler.
> >       GBE: align the fields in union ImageInfoKey.
> >       utests: Fix a bug in image_1D_buffer.
> >       GBE: align the fields in union ImageInfoKey.
> >       utests: Fix a bug in image_1D_buffer.
> >       runtime: set correct state for constant buffer on hsw.
> >       runtime: set correct state for constant buffer on hsw.
> >       GBE: Refine bti usage in backend & runtime.
> >       GBE: Handle bti allocation for internal buffer used by printf.
> >       GBE: remove some useless code for getting printf buffer address.
> >       GBE: Fix a warning in getConstantPointerRegister.
> >       GBE: Fix type size for vector3
> >       GBE: initialize BTI structure to zero.
> >       GBE: Fix a bug in gatherBTI.
> >       cmake: Fix a license issue.
> >       GBE: clear deadprintfs when current function is done.
> >       GBE: refine the llvm multi-thread related code.
> >       GBE: Fix type size for vector3
> >       cmake: Fix a license issue.
> >       GBE: clear deadprintfs when current function is done.
> >       GBE: refine the llvm multi-thread related code.
> >       GBE: Optimize constant load with sampler.
> >       GBE: Refine bti usage in backend & runtime.
> >       GBE: Handle bti allocation for internal buffer used by printf.
> >       GBE: initialize BTI structure to zero.
> >       GBE: Fix a bug in gatherBTI.
> >       GBE/libocl: Fix sub_sat corner case.
> >       GBE: Fix sub_sat corner case.
> >       GBE: Output linkModules's error message.
> >       GBE/libocl: Add __gen_ocl_get_timestamp() to get timestamp.
> >       GBE: Fix a bug when setting flag register
> >       GBE: add legalize pass to handle wide integers
> >       Re-apply "improve the build performance of vector type built-in function."
> >       GBE: workaround register allocation fail caused by custom loop unroll.
> >       GBE: Fix live range for temporary register in replaceReg
> >       GBE: Fix kernel argument size for vector3
> >       utests: add a test to trigger cl_float3 bug in clSetKernelArg.
> >       GBE: Fix a bitcast from float vector to wide interger issue in legalize pass.
> >       GBE: Do topological sorting of basicblocks.
> >       docs: update mixed_buffer_pointer document.
> >
> > Yang Rong (54):
> >       Add some hsw missed pci ids (reserved PCI IDs).
> >       Add some hsw missed pci ids (reserved PCI IDs).
> >       Fix a utest compiler_async_stride_copy typo.
> >       Fix a utest compiler_async_stride_copy typo.
> >       Only compiler X11 files and do X11 operations when found X11.
> >       Only compiler X11 files and do X11 operations when found X11.
> >       Update Beignet.mdwn X11 dependency.
> >       Two minor fix.
> >       Fix two bugs.
> >       Update Beignet.mdwn X11 dependency.
> >       Two minor fix.
> >       Fix two bugs.
> >       Update README for the command parser in drm kernel.
> >       Update README for the command parser in drm kernel.
> >       Update license disclaimer.
> >       Update license disclaimer.
> >       Avoid use GenNativeInstruction directly out of GenEncode and gen_insn_compact.
> >       BDW: Add BDW pci ids and BDW device struct.
> >       BDW: Add BDW instruction define.
> >       BDW: Add Gen8Encoder and Gen7Encoder.
> >       BDW: Add class Gen8Context.
> >       BDW: Pass Jip and Uip when patchJMPI.
> >       BDW: Refine intel_gpgpu_setup_bti and add intel_gpgpu_set_base_address for BDW.
> >       BDW: add some BDW function.
> >       BDW: Fix Pointer argument curbe alloce size.
> >       BDW: enable SLM in BDW.
> >       BDW: Fix unsample bug.
> >       BDW: Refine BDW's int 32*32 multiply.
> >       BDW: BDW don't need add slm offset, remove it.
> >       BDW: Add BDW Device id to gen binary generater and binary serialize in backend.
> >       BDW: Add device's sub slice field, for cl_get_kernel_max_wg_sz.
> >       BDW: Correct scratch buffer of BDW.
> >       BDW: Forgot to set UIP of else in BDW.
> >       BDW: Correct BDW device name.
> >       BDW: Fix a scaler int 32*32 bug.
> >       BDW: Need not restore SLM setting in BDW.
> >       BDW: Correct stack setting in BDW.
> >       Fix a segment fault.
> >       Fix a HSW regression.
> >       Fix memcpy and memset bug.
> >       Fix HSW thread_n <= 64 assert.
> >       Fix a HSW constant buffer regression.
> >       BDW: Change BDW's max work group size to 512.
> >       BDW: Fix load/store half error.
> >       BDW: Also need set Shader Channel Select for constant buffer in BDW.
> >       Fix a upsample regression.
> >       Fix a HSW regression.
> >       Refine the the error handling in function cl_command_queue_ND_range_gen7.
> >       Refine the intel gpgpu delete.
> >       Fix a size assert when setup bti.
> >       BDW: Fix bwd 32*32 scalar multiplication bug.
> >       IVB/HSW/BYT: Revert the Dynamic state Base Addr and relative buffers address setting.
> >       BDW: Set the URB/REST size to 384K/384K when SLM disable.
> >       BDW: Change the default tiling mode to TILING_Y on BDW.
> >
> > Yichao Yu (1):
> >       Use ${PYTHON_EXECUTABLE} to run python scripts.
> >
> > Yongjia Zhang (6):
> >       Add Gen IR IF, ELSE and ENDIF
> >       Add Gen instruction 'else'
> >       Add structure identification on ir level
> >       Use instruction if else and endif manipulate structures
> >       Enable structural analysis
> >       GBE: fix empty block disassemble bug.
> >
> > Zhenyu Wang (5):
> >       Make use of write enable flag for mem bo map
> >       Clear batch buffer pointer after unmap
> >       Use pread/pwrite for buffer enqueue read/write
> >       Fix AUX buffer for page alignment
> >       Remove intel_gpgpu_check_binded_buf_address()
> >
> > Zhigang Gong (111):
> >       Build: Change versioning policy.
> >       runtime/driver: refine error handlings.
> >       runtime: fix some subtle event bugs.
> >       runtime/driver: refine error handlings.
> >       runtime: fix some subtle event bugs.
> >       gbe: add the new else instruction to the assert checking.
> >       docs: add a NEWS document to point to the release notes pages.
> >       docs: add a NEWS document to point to the release notes pages.
> >       Bump to 0.9.2.
> >       NEWS: update for 0.9.2.
> >       GBE: cleanup image base index related code.
> >       GBE: refine post register allocation scheduling for global buffers.
> >       GBE: refactor the immediate class to support vector data type.
> >       GBE: simplify processConstant.
> >       GBE: complete constant expression processing.
> >       GBE: enable constant expression processing.
> >       utest: add new test for constant expression processing.
> >       GBE: Reduce random behaviour of the code generation
> >       GBE: adjust preferred vector length.
> >       GBE: refactor the immediate class to support vector data type.
> >       GBE: simplify processConstant.
> >       GBE: complete constant expression processing.
> >       GBE: enable constant expression processing.
> >       utest: add new test for constant expression processing.
> >       Revert "GBE: refine post register allocation scheduling for global buffers."
> >       utests: fix two utest bugs.
> >       GBE: fix error in the rootn fastpath function for some special input.
> >       utests: fix two utest bugs.
> >       GBE: fix error in the rootn fastpath function for some special input.
> >       Add new vload benchmark/test case.
> >       GBE: optimize unaligned char and short data vector's load.
> >       GBE: relax the batch byte/short load vector size restrication.
> >       GBE: refine the unaligned data gathering.
> >       GBE: adjust preferred vector length.
> >       GBE: fixup/refine a bug for image1D array's extra binding index handling.
> >       GBE: remove the user defined macro cl_khr_fp64.
> >       GBE: avoid one optimization pass to generate wide integer.
> >       GBE: avoid one optimization pass to generate wide integer.
> >       GBE: fix a bug with LLVM 3.3.
> >       GBE: fallback if we get a wider than i64 constant.
> >       GBE: fix a bug with LLVM 3.3.
> >       GBE: fallback if we get a wider than i64 constant.
> >       GBE: cleanup image base index related code.
> >       GBE: fixup/refine a bug for image1D array's extra binding index handling.
> >       build: fix a CXXFLAGS override bug in backend directory.
> >       GBE: fix some predfeined OCL macros.
> >       Runtime: Implement clGetExtensionFunctionAddressForPlatform.
> >       Runtime: Implement clGetExtensionFunctionAddressForPlatform.
> >       GBE/libocl: fix the wrong prototype of scalar native_powr.
> >       GBE: fix bugs when handling -cl-std option.
> >       GBE: fix bugs when handling -cl-std option.
> >       GBE/libocl: Added one missing prototype fma().
> >       GBE: don't return error if we get an empty module.
> >       GBE: Fix a potential segfault.
> >       GBE: Fix a potential segfault.
> >       GBE: fix a potential memory leak bug.
> >       GBE: fix a potential memory leak bug.
> >       GBE: don't enable double by default.
> >       GBE: don't enable double by default.
> >       GBE: fix multiple files compilation bugs.
> >       runtime: fix program binary type bug.
> >       runtime: fix build status handling.
> >       runtime: fix program binary type bug.
> >       runtime: fix build status handling.
> >       GBE: fix multiple files compilation bugs.
> >       Update readme.
> >       Update readme.
> >       Document fixup.
> >       Remove out-of-date document.
> >       Bump to 0.9.3.
> >       Remove out-of-date document.
> >       Update NEWS.
> >       GBE/libocl: add missing vector builtin definition for fma.
> >       GBE/libocl: fix a regression after libocl change.
> >       Revert "improve the build performance of vector type built-in function."
> >       GBE/libocl: fix build dependency issue.
> >       GBE: fix a loop header file including bug.
> >       GBE: structurized loop exit need an extra branching instruction when do reordering.
> >       GBE: fix a bug in legalize pass.
> >       GBE: do intrinsics lowering pass earlier.
> >       GBE: fix a legalize pass bug when bitcast wide integer to incompaitble vector.
> >       GBE: Add a customized loop unrolling handling mechanism.
> >       GBE: disable custom loop unroll for LLVM 3.3/3.4.
> >       GBE: add Selection instruction handler at legalize pass.
> >       GBE: increase maximum src/dst operands to 32.
> >       GBE: add basic PHINode support in legalize pass.
> >       GBE: fix regression caused by simple block optimization.
> >       GBE: handle dead loop BBs in liveness analysis.
> >       GBE: set default address space to -1 to avoid incorrect unroll hint.
> >       GBE: fix a wrong type of cl_device_info.
> >       utest: change the box_blur_image to be identical to box_blur.
> >       utests: replace the nodistriutable picture.
> >       GBE: fix disassembly bug.
> >       GBE: fix a bool handling bug when SEL on a uniform bool variable.
> >       GBE: Support more instructions for constant expression handling.
> >       GBE: remove useless debug info.
> >       Revert "add test for clCreateImageFromLibvaIntel"
> >       Revert "fix issue to create cl image from libva with non-zero offset"
> >       utests: remove all shader toy test cases.
> >       License: adjust all license version to LGPL v2.1+.
> >       GBE: fix relocatable issue for pch file.
> >       Revert "BDW: Change the default tiling mode to TILING_Y on BDW."
> >       GBE: fix one double related bugs for post register scheduling.
> >       update some documents.
> >       runtime: fix one bug in BDW image.
> >       Update documents.
> >       runtime: refine version handling.
> >       runtime: fix bug in cl_enqueue_read_buffer.
> >       runtime: disable userptr due to random fail.
> >       GBE: work around error reporting for unresolved symbols
> >       Bump to 1.0.0.
> >
> >
> > --
> > Zhigang Gong,
> > Thanks.
> > _______________________________________________
> > Beignet mailing list
> > [email protected]
> > http://lists.freedesktop.org/mailman/listinfo/beignet
>
> --
> -Igor Gnatenko
